Yo smarty-pants Dramatards - I've asked before, like a year ago, but is there any method or program to read numbers from a PDF and convert that shit into actual coordinates, like into a text or Excel file or something? I've been asked to try and read the coordinates off a digital diagram instead of manually typing it all in.
Screenshot + paste into ChatGPT. Or you can also apparently give it entire PDFs, but I haven't tried that.
IT WORKED
It is beautiful, do enjoy. I adore this thread!
Using ChatGPT for OCR is like renting a Marion 8750 dragline to dig a hole in your yard.
Now OpenAI has your company data.
BIPOC, all this time you didn't try a pdf editor with OCR?
Just be aware that OCR tends to be only around 96% accurate, so with enough input you will have some errors.
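One way to live with that error rate is to never trust an OCR'd number without a sanity check. Below is a minimal sketch, not a definitive method. It assumes the 96% figure is per character and that errors are independent (the comment doesn't say either), and that the coordinates come out as one x/y pair per line inside a known range. The names `chance_number_is_clean` and `flag_suspect_rows`, the line format, and the bounds are all made up for illustration.

```python
import re

# Assumption: ~96% accuracy is per character, and errors are independent.
PER_CHAR_ACCURACY = 0.96


def chance_number_is_clean(n_chars, accuracy=PER_CHAR_ACCURACY):
    """Probability that every character of an n_chars-long number is read correctly."""
    return accuracy ** n_chars


# Hypothetical line format: two decimal numbers, e.g. "512345.67 6234567.89".
PAIR_RE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)[,\s]+(-?\d+(?:\.\d+)?)\s*$")


def flag_suspect_rows(lines, x_range, y_range):
    """Split OCR output lines into (good pairs, suspect lines for manual review)."""
    good, suspect = [], []
    for line in lines:
        m = PAIR_RE.match(line)
        if not m:
            suspect.append(line)  # e.g. a letter 'O' read in place of a zero
            continue
        x, y = float(m.group(1)), float(m.group(2))
        if x_range[0] <= x <= x_range[1] and y_range[0] <= y <= y_range[1]:
            good.append((x, y))
        else:
            suspect.append(line)  # parsed fine, but outside the expected area
    return good, suspect
```

Under those assumptions an 8-character coordinate comes through fully clean only about 72% of the time (0.96 to the 8th power), so the suspect pile is worth reading by hand. A wrong digit that still lands inside the range will slip through, so this catches only the obvious misreads.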
@kaamrev
Just literally throw shit at the ChatGPT wall to see what sticks: do OCR (optical character recognition) first, then ask the robot to sort it out for you.
Just make sure you preface it in a way that tricks the algorithm into ignoring nono-words, by framing yourself as the good guy who hates that and needs help from the computer to purge it from the dataset.
Some journos made https://tabula.technology/ for getting data out of shitty government PDFs, but it might be abandonware because no one pays for journ*lism and governments don't actually want anyone to read their legally mandated reports.
i just used tabula
and it gave me this shit
Haha, if that's true you're even more fricked.
I think there are some AIs that can read images now, but you might have to pay for it.
Does your PDF reader not detect the numbers as text and allow you to copy-paste?
I don't know where or how. I've started using Foxit PDF Editor to refine and improve diagrams that my survey software's inbuilt maps can't improve on their own.
But I don't know how to activate any form of "Optical Character Recognition", or OCR, as it's apparently called.
Isn't it all automatic? Like, just click/double-click the text and it'll highlight, no?
The "text" might just be stored as a picture.
You can have proper text in a PDF, but you don't have to. If the files were made out of scanned images, then depending on the program that saved them, there might not have been any OCR done.
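A quick way to find out which kind of PDF you have is to ask a library for the text layer and see if anything comes back. Here is a minimal sketch, assuming the third-party `pypdf` package is installed. `page_texts`, `pages_needing_ocr`, and the 20-character cutoff are names and numbers invented for this example.

```python
def page_texts(pdf_path):
    """Return the extractable text of each page (empty string when there is none)."""
    from pypdf import PdfReader  # third-party: pip install pypdf

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


def pages_needing_ocr(texts, min_chars=20):
    """1-based numbers of pages with too little text to count as a real text layer.

    min_chars is an arbitrary cutoff chosen for this sketch.
    """
    return [i for i, t in enumerate(texts, start=1) if len(t.strip()) < min_chars]
```

If `pages_needing_ocr(page_texts("diagram.pdf"))` lists every page (`diagram.pdf` being a placeholder name), the numbers are most likely stored as pictures and copy-paste will never work until OCR has been run.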
i have no idea what you just said to me
Oh you're r-slurred. Have you tried googling it? "Foxit pdf how to ocr."
https://github.com/mindee/doctr
but there should be plenty of PDF viewers with OCR integrated
https://www.pdftoexcel.com/
I use this at work. If the format of your PDFs is consistent, you can use ChatGPT to make a macro to make it easier to read.
AutoHotkey has OCR and is great for automation.
is that a program?
pay a third worlder 10 cents an hour to do it for you
Nowadays, can't you just write some basic Python script that takes a picture of your PDF and feeds it to some AI-assisted data extraction library?
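Something like that is doable in a few lines. This is only a rough sketch: it swaps in plain Tesseract OCR (through the third-party `pytesseract` and `pdf2image` packages, which also need the Tesseract and Poppler programs installed) in place of an "AI-assisted" library. The rule that a useful line holds exactly two numbers is an assumption about the diagram, and `ocr_pdf`, `text_to_rows`, and `write_csv` are names made up here.

```python
import csv
import re

NUMBER_RE = re.compile(r"-?\d+(?:\.\d+)?")


def ocr_pdf(pdf_path, dpi=300):
    """Render each page to an image and OCR it with Tesseract."""
    from pdf2image import convert_from_path  # third-party, needs Poppler
    import pytesseract  # third-party, needs the Tesseract binary

    pages = convert_from_path(pdf_path, dpi=dpi)
    return "\n".join(pytesseract.image_to_string(page) for page in pages)


def text_to_rows(text):
    """Keep lines that contain exactly two numbers, treated as one (x, y) pair."""
    rows = []
    for line in text.splitlines():
        nums = NUMBER_RE.findall(line)
        if len(nums) == 2:  # assumption: one coordinate pair per line
            rows.append((float(nums[0]), float(nums[1])))
    return rows


def write_csv(rows, out_file):
    """Write the pairs as a CSV file that Excel can open."""
    writer = csv.writer(out_file)
    writer.writerow(["x", "y"])
    writer.writerows(rows)
```

Usage would be roughly `write_csv(text_to_rows(ocr_pdf("diagram.pdf")), open("coords.csv", "w", newline=""))`, with both file names as placeholders. On a real diagram the labels are often scattered around the drawing, so the two-numbers-per-line rule may need replacing with something that fits the actual layout.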
An idiot admires complexity. A genius admires simplicity.