The New Wonderland Archive

Discuss the games (no level solutions or off-topic, please).

Moderators: ~xpr'd~, tyteen4a03, Stinky, Emerald141, Qloof234, jdl

llarson
Rainbow Star
Posts: 1438
Joined: Sun May 20, 2007 11:36 pm

Re: HTTrack project 20GB

Post by llarson » Sat Aug 18, 2012 5:08 am

LexieTheFox wrote:
VirtLands wrote:The following attachment is a sorted list of member names, ID's, emails,
compiled from 241 member webpages. Click on the attached download.

Enjoy.
Ok um... I'm not ok with my ID or Email being given out on that mirror website. Please, when it's up, display my name ONLY.
Agreed :| Except only showing my username.
"Cabbage is useless"~Came to me in a dream
"Asparagus is Unlovable"~Common knowledge
"Radishes are Ugly"~Came to my mind
My Level List
tyteen4a03
Rainbow AllStar
Posts: 4386
Joined: Wed Jul 12, 2006 7:16 am
Contact:

Re: HTTrack project 20GB

Post by tyteen4a03 » Sat Aug 18, 2012 5:22 am

LexieTheFox wrote:
VirtLands wrote:The following attachment is a sorted list of member names, ID's, emails,
compiled from 241 member webpages. Click on the attached download.

Enjoy.
Ok um... I'm not ok with my ID or Email being given out on that mirror website. Please, when it's up, display my name ONLY.
Firstly, the information exposed by VirtLands has nothing to do with me. Secondly, if you chose to publicly display your email address, the bot will grab it, but it will never be shown to the public. If you did not, the bot will not attempt to grab it.

(For anybody who is worried about the exposure of emails in the Off-Topic area: the bot is not configured to access the Off-Topic area in any way, nor is there any code to scrape information accessible only to moderators.)
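
A minimal sketch of the two rules described in this post, assuming phpBB-style URLs that carry the forum ID in an `f=` query parameter. The forum ID `99`, the `email_is_public` flag, and both function names are hypothetical placeholders for illustration; they are not taken from the actual bot.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical IDs of forums the bot must never visit (e.g. Off-Topic).
EXCLUDED_FORUM_IDS = {"99"}


def should_crawl(url):
    """Return False for any URL that points into an excluded forum."""
    query = parse_qs(urlparse(url).query)
    forum_ids = query.get("f", [])
    return not any(fid in EXCLUDED_FORUM_IDS for fid in forum_ids)


def scrapable_profile(profile):
    """Keep the email only if the member chose to display it publicly."""
    cleaned = dict(profile)
    # The flag itself is dropped; it only decides whether the email survives.
    if not cleaned.pop("email_is_public", False):
        cleaned.pop("email", None)
    return cleaned
```

With this shape, a topic URL inside an excluded forum is skipped before any request is made, and a hidden email never reaches the stored record.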
and the duck went moo

Beep bloop
Wonderman109
Rainbow MegaStar
Posts: 3593
Joined: Thu Jun 28, 2012 11:25 pm

Post by Wonderman109 » Sat Aug 18, 2012 2:07 pm

I don't want anything but my username being given away either, please. :| :!: :o
Not really around much these years.
jdl
Rainbow SuperStar
Posts: 2894
Joined: Fri Jun 06, 2008 8:37 pm
Location: West Virginia, USA
Contact:

Post by jdl » Sat Aug 18, 2012 3:30 pm

Guys, your ID/Email/other personal stuff isn't going to be given away. :?
TheCracksOverhead#9565 | Oops, uh oh.
tyteen4a03
Rainbow AllStar
Posts: 4386
Joined: Wed Jul 12, 2006 7:16 am
Contact:

Post by tyteen4a03 » Sat Aug 18, 2012 4:59 pm

Wonderman109 wrote:I don't want anything but my username being given away either, please. :| :!: :o
If you did not give away your personal information publicly, the bot won't be able to get it.
and the duck went moo

Beep bloop
VirtLands
Rainbow Master
Posts: 756
Joined: Thu Dec 29, 2005 1:49 am

emails list

Post by VirtLands » Sat Aug 18, 2012 5:58 pm

I have removed the PCPuzzle Member's List.TXT attachment from page 1.

Wonderman109 wrote: "I don't want anything but my username being given away either, please."
Last edited by VirtLands on Thu Aug 23, 2012 5:37 am, edited 2 times in total.
tyteen4a03
Rainbow AllStar
Posts: 4386
Joined: Wed Jul 12, 2006 7:16 am
Contact:

Post by tyteen4a03 » Sat Aug 18, 2012 7:08 pm

That is what I mean by "public information".

EDIT: I am happy to announce that the spider is finished. Now I am testing the spider, then moving on to the Pipeline (the component that stores things in the database). The trickiest part of the Pipeline will be the html2bbcode function, which turns raw HTML back into BBCode.
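
For readers unfamiliar with the idea, here is a rough sketch of what a storage pipeline can look like. Scrapy item pipelines are plain classes whose `open_spider`, `process_item`, and `close_spider` methods are called by name, so this example needs no Scrapy import. The choice of SQLite, the table layout, and the class name are assumptions for illustration, not the project's actual code.

```python
import sqlite3


class SQLitePipeline:
    """Pipeline-shaped class that stores scraped user records in SQLite."""

    def __init__(self, path=":memory:"):
        self.path = path
        self.conn = None

    def open_spider(self, spider):
        # Called once when the crawl starts: open the database, create the table.
        self.conn = sqlite3.connect(self.path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS users ("
            "user_id INTEGER PRIMARY KEY, username TEXT, total_posts INTEGER)"
        )

    def process_item(self, item, spider):
        # INSERT OR REPLACE means re-running the spider updates existing rows.
        self.conn.execute(
            "INSERT OR REPLACE INTO users VALUES (?, ?, ?)",
            (item["userID"], item["username"], item["totalPosts"]),
        )
        self.conn.commit()
        return item  # hand the item on to any later pipeline stage

    def close_spider(self, spider):
        # Called once when the crawl ends.
        self.conn.close()
```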
and the duck went moo

Beep bloop
tyteen4a03
Rainbow AllStar
Posts: 4386
Joined: Wed Jul 12, 2006 7:16 am
Contact:

Post by tyteen4a03 » Wed Aug 22, 2012 10:37 am

Debugging took more time than I thought it would. It seems I misunderstood some of the concepts, so now I have to redo some bits of the scraping.

(And as part of the test, I now have almost everything shown to the public)
and the duck went moo

Beep bloop
VirtLands
Rainbow Master
Posts: 756
Joined: Thu Dec 29, 2005 1:49 am

Post by VirtLands » Thu Aug 23, 2012 5:45 am

Great progress.

I would be glad to help in it except that I don't understand any of
that python-ish stuff.
I have Python 3.2 installed, but I never use it.

Send us a screenshot, or data sample. ;)
tyteen4a03
Rainbow AllStar
Posts: 4386
Joined: Wed Jul 12, 2006 7:16 am
Contact:

Post by tyteen4a03 » Tue Sep 04, 2012 5:01 am

Nothing much has happened lately, but the scraping bot is taking shape. Chris (christ) has offered to work together on the website; his experience in PHP scraping will definitely come in handy.

I now have school and have to apply to university (eek), so I won't be able to work on this as much as I could in the summer holidays. Hopefully I have the willpower to keep this project alive... :eek:
and the duck went moo

Beep bloop
VirtLands
Rainbow Master
Posts: 756
Joined: Thu Dec 29, 2005 1:49 am

os.fork

Post by VirtLands » Tue Sep 04, 2012 5:29 am

Wherever you are, may the os.fork be with you.

Looks like I'll have to pick up where he (Tyteen) left off.

This isn't going to be easy.

While you're in university you can help us, right?
tyteen4a03
Rainbow AllStar
Posts: 4386
Joined: Wed Jul 12, 2006 7:16 am
Contact:

Post by tyteen4a03 » Tue Sep 04, 2012 9:34 am

I'm aiming for MIT, and you know how people there like to work day after night after day.

So I'm not sure.
and the duck went moo

Beep bloop
VirtLands
Rainbow Master
Posts: 756
Joined: Thu Dec 29, 2005 1:49 am

MIT

Post by VirtLands » Tue Sep 04, 2012 7:01 pm

I see. Congratulations on this plan.
Last edited by VirtLands on Tue Sep 18, 2012 4:14 am, edited 1 time in total.
tyteen4a03
Rainbow AllStar
Posts: 4386
Joined: Wed Jul 12, 2006 7:16 am
Contact:

Post by tyteen4a03 » Mon Sep 17, 2012 6:46 pm

Development has halted because I had a bit of school work and unexpected projects coming in.

It will restart in... *insert restart time here*
and the duck went moo

Beep bloop
tyteen4a03
Rainbow AllStar
Posts: 4386
Joined: Wed Jul 12, 2006 7:16 am
Contact:

Post by tyteen4a03 » Tue Nov 20, 2012 2:15 am

Because development isn't going anywhere, I'm pushing the code to GitHub for everybody to see. Hopefully I'll have more time over the Christmas break.

Note: This code is broken.

https://github.com/tyteen4a03/wlf_scrapy
and the duck went moo

Beep bloop
VirtLands
Rainbow Master
Posts: 756
Joined: Thu Dec 29, 2005 1:49 am

Note: This code is broken

Post by VirtLands » Tue Nov 20, 2012 8:27 pm

tyteen4a03 wrote:...Note: This code is broken....
So, is it broken as in incomplete, or broken as in shattered?
---------------------------------------------------

Well, this part was interesting anyway:
sample code from TyTeen's items.py:

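from scrapy.item import Item, Field  # import assumed; not shown in the quoted sample


class User(Item):
    """
    A user.
    """
    # Identity and activity
    userID = Field()
    username = Field()
    joinDate = Field()
    totalPosts = Field()
    avatarName = Field()
    # Optional profile details
    location = Field()
    website = Field()
    occupation = Field()
    interests = Field()
    # Contact fields (only present if the member made them public)
    email = Field()
    msn = Field()
    aim = Field()
    yahoo = Field()
    icq = Field()
    signature = Field()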

Rest in peace, code...
tyteen4a03
Rainbow AllStar
Posts: 4386
Joined: Wed Jul 12, 2006 7:16 am
Contact:

Post by tyteen4a03 » Tue Nov 20, 2012 9:28 pm

Incomplete. Sorry if I wasn't clear enough :P

A major reason why it's incomplete comes from pipeline.py. I haven't finished the htmlToBBCode method yet (and it's hacky, I tell ya). Otherwise, the code runs (with a lot of exceptions yet to be fixed).
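
To give a feel for why this conversion gets hacky, here is a small sketch of one possible approach using only Python's standard `html.parser`. It is not the htmlToBBCode method from pipeline.py; the class name, the function name, and the short list of supported tags are assumptions, and real phpBB output (quotes, smilies, nested lists, font sizes) would need far more cases.

```python
from html.parser import HTMLParser

# Only a handful of simple tags; real forum HTML has many more cases.
SIMPLE_TAGS = {"b": "b", "strong": "b", "i": "i", "em": "i", "u": "u"}


class BBCodeConverter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.link_stack = []  # remembers whether each open <a> produced a [url]

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in SIMPLE_TAGS:
            self.out.append("[%s]" % SIMPLE_TAGS[tag])
        elif tag == "a":
            has_href = bool(attrs.get("href"))
            self.link_stack.append(has_href)
            if has_href:
                self.out.append("[url=%s]" % attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.out.append("[img]%s[/img]" % attrs["src"])
        elif tag == "br":
            self.out.append("\n")

    def handle_endtag(self, tag):
        if tag in SIMPLE_TAGS:
            self.out.append("[/%s]" % SIMPLE_TAGS[tag])
        elif tag == "a" and self.link_stack:
            # Close the link only if its opening tag actually emitted [url=...].
            if self.link_stack.pop():
                self.out.append("[/url]")

    def handle_data(self, data):
        # Plain text (with entities already decoded) passes through unchanged.
        self.out.append(data)


def html_to_bbcode(html):
    parser = BBCodeConverter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Unknown tags are silently dropped while their text is kept, which is a deliberate simplification rather than a faithful round trip.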
and the duck went moo

Beep bloop
jdl
Rainbow SuperStar
Posts: 2894
Joined: Fri Jun 06, 2008 8:37 pm
Location: West Virginia, USA
Contact:

Post by jdl » Fri Jun 07, 2013 4:00 pm

So is this still a thing, Tyteen?

How's everything been doing the past few months? :)
TheCracksOverhead#9565 | Oops, uh oh.
myuacc1studios
Rainbow Wizard
Posts: 486
Joined: Tue Apr 30, 2013 6:32 pm
Location: San Jose, CA
Contact:

Post by myuacc1studios » Mon Jun 24, 2013 8:56 pm

Technos72 wrote:My answer to good idea:
Image
AGH TOO MUCH PONY AHHHH
Image
Image
tyteen4a03
Rainbow AllStar
Posts: 4386
Joined: Wed Jul 12, 2006 7:16 am
Contact:

Post by tyteen4a03 » Sat Jul 20, 2013 3:33 pm

jdl wrote:So is this still a thing, Tyteen?

How's everything been doing the past few months? :)
Nothing has really happened in the past few months. I had freelance projects writing XenForo addons, exams, and a very big video project (think full-length movies). Now I'm finishing a music video that is very important to me, then I'm moving to Denmark for a year of foreign exchange.

My coding skills have grown a lot since I started developing XenForo addons, and I can see the new Wonderland Archive being based on XenForo's Resource Manager addon. It would (as always) be a lot, lot easier if Patrick switched to XenForo and gave me direct access to the database for post conversion, but that's not happening soon, if ever.

This project will be out of hiatus somewhat soon, but it is low on my priority list: I need to start coding for profit so I can make a living. As for when the hiatus will be over, I honestly don't know. Feel free to take the current WLF Scraping code, expand it into a full bot, and scrape the forum. With the data, the rest will be a piece of cake.
and the duck went moo

Beep bloop
Muzozavr
Rainbow Spirit Chaser
Posts: 5648
Joined: Wed Jan 11, 2006 2:55 pm

Post by Muzozavr » Sat Jul 20, 2013 4:11 pm

tyteen4a03 wrote:a very big video project (think full-length movies
*twitch*

As a wannabe filmmaker (who doesn't have a camera nor the money for it, nor have I written a finished script) I'm highly interested in knowing what the project is and what your role in it was.
Rest in peace, Kym. I hardly knew ya.
Rest in peace, Marinus. A bright star, you were ahead of me on my own tracks of thought. I miss you.
tyteen4a03
Rainbow AllStar
Posts: 4386
Joined: Wed Jul 12, 2006 7:16 am
Contact:

Post by tyteen4a03 » Sat Jul 20, 2013 4:39 pm

Muzozavr wrote:
tyteen4a03 wrote:a very big video project (think full-length movies
*twitch*

As a wannabe filmmaker (who doesn't have a camera nor the money for it, nor have I written a finished script) I'm highly interested in knowing what the project is and what your role in it was.
Everything. It's not even a movie movie - just a lot of photos plus heaps of funny content of my 100 friends at school put together in a Ponies - The Anthology way. It seems to be a success - my friends enjoyed it a lot.

However, my Music Video coming up will be a real professional production - stay tuned. :)
and the duck went moo

Beep bloop
tyteen4a03
Rainbow AllStar
Posts: 4386
Joined: Wed Jul 12, 2006 7:16 am
Contact:

Post by tyteen4a03 » Sun Jul 21, 2013 5:06 pm

A lot has happened in the past 48 hours, and I'm sad to announce the MV will need to be postponed by at least a year. However, this means I will have a little more free time to work on this.

I've shot MS an offer of my spare XenForo license and am now patiently waiting for a response. If MS gives switching to XenForo the green light, I'll bump the priority of this project and incorporate an API for future Wonderland games to directly upload levels and adventures. If not, I'll continue my work on the scraping bot. It's quite near completion, as I plan on not scraping some very outdated data (like IM screen names; who uses any of the IMs listed in the Profile anymore?) and slimming down the functionality (like the built-in HTML-to-BBCode conversion; this HTML mess I can fix by hand).
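
Dropping the outdated IM data could be as simple as filtering each scraped record before it is stored. This is only a sketch: the field names follow the items.py sample quoted earlier in the thread, while the `slim_user` helper and the use of plain dicts are assumptions for illustration.

```python
# IM screen-name fields from the quoted items.py sample that are no longer wanted.
OUTDATED_FIELDS = {"msn", "aim", "yahoo", "icq"}


def slim_user(user):
    """Return a copy of a scraped user record without the IM screen names."""
    return {key: value for key, value in user.items() if key not in OUTDATED_FIELDS}
```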
and the duck went moo

Beep bloop