Comic Book Plus Forum
News, Rules And Introductions => Basic Site Rules => Topic started by: EddieTea on June 11, 2013, 01:36:59 PM
-
Hey everyone, firstly well done on creating an amazing collection of important literature.
I'm a post-grad linguistics researcher at the University of Wales, Swansea, specialising in children's literature. I'm currently building a corpus of 'classic' children's stories, and using Project Gutenberg to download texts (e.g. Treasure Island, Peter Pan, Little Women, Adventures of Mark Twain). As the period of publication is around 1860-1920, there are no copyright issues in downloading the texts and using them for linguistic analysis.
I'm particularly interested in collections such as the 'Half Dime Library' you have on your site, published during the same period.
Is there a way of extracting the text, or downloading rich text files (RTF), without the illustrations?
-
Is there an RTF extractor like there is for PDF files that are extracted without the text? Just a thought.
RB @ Work
-
Aren't those image files? (Either gif, jpg or png.)
If so, I don't believe you could extract text, as there is no real text, just pixels creating the illusion of text.
What you would need is an Optical Character Recognition (OCR) program, which could read the image and create a text document.
Or ask the person who scanned the book to rescan the pages you want with OCR.
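If you're comfortable with a little Python, here is a rough sketch of the OCR route. It assumes the pages are saved as .jpg files in one folder, and that you have installed the Tesseract OCR engine plus the third-party pytesseract and Pillow packages. The function names and the folder layout are made up for illustration, and OCR on old dime-novel print will likely need hand correction afterwards.

```python
import re
from pathlib import Path


def clean_ocr_text(raw: str) -> str:
    """Tidy raw OCR output for corpus use."""
    # Rejoin words split across a line break: "won-\nderful" -> "wonderful".
    # Caveat: this also merges real hyphenated compounds that happen to break there.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


def ocr_page(image_path: str) -> str:
    """Run OCR on one scanned page and return cleaned text."""
    # Third-party imports (assumed installed) are kept local so that
    # clean_ocr_text can be used on its own without them.
    from PIL import Image
    import pytesseract

    return clean_ocr_text(pytesseract.image_to_string(Image.open(image_path)))


def ocr_folder(folder: str, out_file: str) -> None:
    """OCR every .jpg in a folder (in filename order) into one text file."""
    pages = sorted(Path(folder).glob("*.jpg"))
    with open(out_file, "w", encoding="utf-8") as out:
        for page in pages:
            out.write(ocr_page(str(page)) + "\n\n")
```

You would call something like ocr_folder("half_dime_issue", "half_dime_issue.txt") on a folder of unpacked page scans. The result is plain text with no illustrations, which is closer to what a corpus tool expects than RTF.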
-
Hi,
I'm new here and I also want to be able to pick out a particular panel and be able to use it on my website and/or blog. How can I do that? I'm using Windows 8 which isn't compatible with some programs. Any suggestions?
Thanks.
-
Once you've downloaded the file, use an image editing program to open the page you want and copy the panel(s) you want.
Or you could create a duplicate of the page and use Crop. (Don't use crop on the original if you wish to reread the rest of the page.)
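If you'd rather script it, the same "crop a duplicate, not the original" idea can be sketched in Python. This assumes the third-party Pillow package is installed; the function names and the "_panel" suffix are just illustrative choices, and you would need to work out the pixel coordinates of the panel yourself.

```python
from pathlib import Path
from typing import Tuple


def panel_copy_path(page_path: str, suffix: str = "_panel") -> Path:
    """Build a new filename next to the original so the page itself is never overwritten."""
    p = Path(page_path)
    return p.with_name(p.stem + suffix + p.suffix)


def crop_panel(page_path: str, box: Tuple[int, int, int, int]) -> Path:
    """Crop one panel out of a page and save it as a separate file.

    box is (left, upper, right, lower) in pixels, measured from the top-left corner.
    """
    from PIL import Image  # Pillow, assumed installed

    out_path = panel_copy_path(page_path)
    with Image.open(page_path) as page:
        page.crop(box).save(out_path)
    return out_path
```

For example, crop_panel("page01.jpg", (40, 60, 620, 480)) would write page01_panel.jpg alongside the untouched page01.jpg.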
-
I'm not sure if Windows 8 comes with image editing software.
If not, you might grab a free one like IrfanView, which can crop and save most image formats.
Don't forget to credit the original scanner and the site when you repost.
Good luck,
-Yoc
-
WordPress lets you crop any image, a fact I didn't realize when I first started blogging on it. It saves me some time if I haven't already cropped the image. My usual practice is to crop images in Paint, including some images I've saved from this site. I don't know how to sharpen the images. I guess you need Photoshop for that, which I don't have.
-
IrfanView can sharpen images, as can the free Paint.NET.
-
If you download IrfanView and the plug-ins available for it, you can use lossless JPG editing, which might cut down on the blurring associated with editing JPGs. IrfanView is even able to use some of the Photoshop plug-ins out there.
-Yoc
Yes, I use IrfanView a lot.
-
Hi there,
I
-
I use Firefox with the "DownThemAll!" add-on. I haven't had any time-out problems; you might want to try it.