Topic: Text files (Read 6005 times)

EddieTea · « **on:** June 11, 2013, 01:36:59 PM »

Hey everyone, firstly well done on creating an amazing collection of important literature.

I'm a post-grad linguistics researcher at the University of Wales, Swansea, specialising in children's literature. I'm currently building a corpus of 'classic' children's stories, and using Project Gutenberg to download texts (e.g. Treasure Island, Peter Pan, Little Women, Adventures of Mark Twain). As the period of publication is around 1860-1920, there are no copyright issues in downloading the texts and using them for linguistic analysis.

I'm particularly interested in collections such as the 'Half Dime Library' you have on your site, published during the same period.

Is there a way of extracting the text, or downloading rich text files (RTF), without the illustrations?

mr_goldenage · « **Reply #1 on:** June 11, 2013, 03:49:37 PM »

Is there an RTF extractor like there is for PDF files that are extracted without the text? Just a thought.

RB @ Work

SuperScrounge · « **Reply #2 on:** June 12, 2013, 07:38:50 AM »

Aren't those image files? (Either gif, jpg or png.)

If so I don't believe you could extract text as there is no real text, just pixels creating the illusion of text.

What you would need is an Optical Character Recognition (OCR) program which could read the image and create a text document.

Or ask the scanner to rescan the pages you want with OCR.
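
For anyone who wants to try the OCR route on a whole book, here is a rough Python sketch. It assumes the pages have already been unpacked into one folder as image files, and that the third-party `pytesseract` and Pillow packages plus the Tesseract engine are installed; none of that comes from this thread. The names `ocr_folder` and `tesseract_ocr` are made up for illustration, and results on old, yellowed scans will likely need hand correction.

```python
from pathlib import Path
from typing import Callable

# Image types mentioned above (gif, jpg or png).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def tesseract_ocr(image_path: Path) -> str:
    """OCR one page image. Assumes pytesseract, Pillow and Tesseract are installed."""
    # Imported here so the rest of the sketch loads without these packages.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as page:
        return pytesseract.image_to_string(page)


def ocr_folder(folder: Path, ocr: Callable[[Path], str] = tesseract_ocr) -> str:
    """Run OCR over every page image in a folder, in filename order."""
    pages = sorted(p for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)
    # One blank line between pages keeps the text usable for corpus tools.
    return "\n\n".join(ocr(p).strip() for p in pages)
```

Something like `Path("book.txt").write_text(ocr_folder(Path("book_pages")))` would then give one plain text file per book, with no illustrations in it. The folder and file names there are placeholders.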

dcburtonjr · « **Reply #3 on:** June 27, 2013, 04:23:22 AM »

Hi,
I'm new here and I also want to be able to pick out a particular panel and be able to use it on my website and/or blog. How can I do that? I'm using Windows 8 which isn't compatible with some programs. Any suggestions?
Thanks.

SuperScrounge · « **Reply #4 on:** June 27, 2013, 05:48:08 AM »

Once you've downloaded the file use an image editing program to open the page you want and Copy the panel(s) you want.

Or you could create a duplicate of the page and use Crop. (Don't use crop on the original if you wish to reread the rest of the page.)
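
If you would rather script it than use an image editor, the same crop-a-copy idea can be sketched in Python. This assumes the third-party Pillow package is installed, and that you have found the panel's pixel coordinates yourself (for example by hovering over the corners in an image viewer). `checked_box` and `crop_panel` are hypothetical helper names, not part of any program mentioned here.

```python
def checked_box(box, page_size):
    """Return the (left, upper, right, lower) box if it fits inside the page."""
    left, upper, right, lower = box
    width, height = page_size
    if not (0 <= left < right <= width and 0 <= upper < lower <= height):
        raise ValueError(f"panel box {box} does not fit a {width}x{height} page")
    return (left, upper, right, lower)


def crop_panel(page_path, box, out_path):
    """Save one panel to a new file; the original page file is not modified."""
    # Assumes Pillow is installed; imported here so checked_box works without it.
    from PIL import Image

    with Image.open(page_path) as page:
        panel = page.crop(checked_box(box, page.size))
        panel.save(out_path)
```

For example, `crop_panel("page05.jpg", (40, 60, 700, 520), "panel.png")` would write just that rectangle to `panel.png`. The file names and numbers are made up.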

Yoc · « **Reply #5 on:** June 27, 2013, 03:22:35 PM »

I'm not sure if Windows 8 comes with image editing software.
If not you might grab a free one like Irfanview which can crop and save most image formats.

Don't forget to credit the original scanner and the site when you repost.

Good luck,
-Yoc

jimmm kelly · « **Reply #6 on:** June 27, 2013, 03:39:53 PM »

WordPress allows you to crop any images, a fact I didn't realize when I first started blogging on it. It saves me some time if I haven't already cropped the image. My usual practice is to crop images in Paint, including some images I've saved from this site. I don't know how to sharpen the images. I guess you need Photoshop for that, which I don't have.

narfstar · « **Reply #7 on:** June 27, 2013, 04:39:32 PM »

Irfanview allows sharpening, as does the free Paint.NET.

Yoc · « **Reply #8 on:** June 27, 2013, 10:59:01 PM »

If you download Irfanview and the available plug-ins from them, you can use lossless JPG editing, which might cut down on the blurring associated with editing JPGs. Irfanview is even able to use some Photoshop plug-ins out there.

-Yoc
Yes, I use Irfanview a lot

Comix · « **Reply #9 on:** August 03, 2013, 04:56:12 PM »

Hi there,
I

SammiD · « **Reply #10 on:** September 04, 2016, 12:58:28 PM »

I use Firefox with the "DownThemAll!" add-on. I haven't had any time-out problems; you might want to try it.

