• commet-alt-w@lemmygrad.ml · 2 years ago

    my first question would be: what book is it? someone might have already published a pirated copy of it somewhere, it might just be difficult to find.

    i saw someone mentioning making a wget script to download each page, which is where i would start too.
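
    the same idea in python with urllib would be a minimal sketch like this, assuming the pages sit at a predictable numbered url. the url pattern, page count, and session cookie below are all placeholders, you’d pull the real ones from the browser’s network tab:

    ```python
    # download each page of the book one by one, like a wget loop.
    # BASE, PAGE_COUNT, and COOKIE are assumptions; check the real
    # requests in your browser's network tab first.
    import urllib.request

    BASE = "https://example.com/book/page/{}"   # hypothetical url pattern
    COOKIE = "session=paste-your-cookie-here"   # logged-in session cookie
    PAGE_COUNT = 300                            # assumed number of pages

    for n in range(1, PAGE_COUNT + 1):
        req = urllib.request.Request(BASE.format(n),
                                     headers={"Cookie": COOKIE})
        with urllib.request.urlopen(req) as resp:
            with open(f"page_{n:04d}.html", "wb") as out:
                out.write(resp.read())
    ```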

    the other thing worth trying is the wkhtmltopdf tool. it’s actually a C++/Qt binary rather than python, but the pdfkit python wrapper for it installs easily from pypi with pip. not sure what it would take to crawl and paginate the whole book without experimenting with the website tho. if there’s authentication involved and an api security layer in their system, scripting the download may become difficult/cumbersome. a rough sketch is below.
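
    a minimal pdfkit sketch, assuming the book pages are plain html behind a session cookie. the urls, page count, and cookie name here are guesses to adapt:

    ```python
    # pip install pdfkit; also needs the wkhtmltopdf binary on your PATH.
    # pdfkit renders each url with wkhtmltopdf and concatenates the
    # results into a single pdf.
    import pdfkit

    pages = [f"https://example.com/book/page/{n}" for n in range(1, 301)]
    options = {
        # passed through as wkhtmltopdf's --cookie flag;
        # name/value are placeholders
        "cookie": [("session", "paste-your-cookie-here")],
        "quiet": "",
    }
    pdfkit.from_url(pages, "book.pdf", options=options)
    ```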

    crawling and paginating the webpages can be done with python too, using scrapy or python’s own urllib/html.parser libraries. note that scrapy doesn’t actually drive a browser, it speaks http directly; if you want to log in through the actual webpage, selenium or playwright are the tools that control a real browser, so you don’t have to worry about auth headers in curl or wget requests, or about some crud/rest library for interacting with an api. a bare-bones scrapy spider is sketched below.
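
    a bare-bones scrapy spider sketch; the start url and the css selectors for the page body and the “next” link are assumptions, you’d inspect the real site to find them:

    ```python
    # run with: scrapy runspider book_spider.py -o pages.json
    import scrapy

    class BookSpider(scrapy.Spider):
        name = "book"
        start_urls = ["https://example.com/book/page/1"]  # placeholder

        def parse(self, response):
            # grab the page content (this selector is an assumption)
            yield {
                "url": response.url,
                "html": response.css("div.page-content").get(),
            }
            # follow the "next page" link until there isn't one
            next_page = response.css("a.next::attr(href)").get()
            if next_page:
                yield response.follow(next_page, callback=self.parse)
    ```

    that dumps the raw html of every page into pages.json to post-process into whatever format you want.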

    it’s one of those problems with no easy general solution: if the system itself, the website op is using, doesn’t provide export as a feature and heavily drm’s/restricts access, every site ends up needing its own workaround.

  • rcbrk@lemmy.ml · 2 years ago

    If a raster-image version would be OK, one avenue to look into is opening the network tab in Firefox’s developer tools and browsing through all the pages. You may need to zoom in to ensure high-enough-resolution images are loaded.

    At the end of this, click the cog button and “save all as HAR file”.

    Then you’ll need to write some code to extract and correctly order all the page images, then feed them into a PDF generator tool; see the sketch below.
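
    A rough sketch of that step, assuming the images are base64-encoded inside the HAR and that each image URL carries a page number. Both the URL regex and the mime-type filter below are guesses you’d adapt to the real site:

    ```python
    # pip install img2pdf. Walks the HAR entries, pulls out page
    # images, sorts them by page number, and stitches them into a pdf.
    import base64, json, re
    import img2pdf

    with open("pages.har", encoding="utf-8") as f:
        har = json.load(f)

    images = []
    for entry in har["log"]["entries"]:
        url = entry["request"]["url"]
        content = entry["response"].get("content", {})
        if "image" not in content.get("mimeType", ""):
            continue
        m = re.search(r"page[_-]?(\d+)", url)  # hypothetical URL pattern
        if not m or content.get("encoding") != "base64":
            continue
        images.append((int(m.group(1)), base64.b64decode(content["text"])))

    images.sort(key=lambda t: t[0])  # order by extracted page number
    with open("book.pdf", "wb") as f:
        f.write(img2pdf.convert([img for _, img in images]))
    ```

    One caveat: img2pdf embeds images losslessly but refuses ones with an alpha channel, so PNGs may need flattening first.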

    I got as far as manually inspecting the HAR file with some tool and noting that the page images are indeed there, but my need for the textbooks lapsed, so I did no further work.