Yo smarty-pants Dramatards - I've asked before, like a year ago, but is there any method or program to read numbers from a PDF and convert that shit into actual coordinates, like into a text or Excel file or something? I've been asked to read the coordinates off a digital diagram instead of manually typing them all in.

:marseynerd2:

29
Jump in the discussion.

No email address required.

Screenshot + paste into ChatGPT. Or you can also apparently give it entire pdfs but I haven't tried that.

OCR

https://i.rdrama.net/images/17319270254137383.webp

IT WORKED :marseytrollcrazy: :marseytrollcrazy: :marseytrollcrazy:

It is beautiful :marseyxd: do enjoy, I adore this thread!

Using ChatGPT for OCR is like renting a Marion 8750 dragline to dig a hole in your yard.

Now OpenAI has your company data. :marseymerchant:

BIPOC, all this time you didn't try a pdf editor with OCR?

Just be aware that OCR tends to be only around 96% accurate, so with enough input you will get some errors.
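To put a rough number on that, here is a minimal sketch. It assumes the 96% figure is a per-character accuracy and that errors are independent, neither of which the figure itself guarantees. The function names are made up for illustration.

```python
def expected_errors(n_chars, accuracy=0.96):
    """Expected number of misread characters (assumes independent per-character errors)."""
    return n_chars * (1.0 - accuracy)


def prob_all_correct(n_chars, accuracy=0.96):
    """Chance that every one of n_chars characters is read correctly (same assumption)."""
    return accuracy ** n_chars


if __name__ == "__main__":
    # A 10-character coordinate, and a sheet with 1000 characters of numbers.
    print(prob_all_correct(10))
    print(expected_errors(1000))
```

Under those assumptions a single 10-character coordinate comes through clean only about two times in three, so spot-checking the output against the diagram is worth the time.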

:marseyhesright#:

@kaamrev

just literally throw shit at the ChatGPT wall to see what sticks: do the OCR (optical character recognition) first, then ask the robot to sort it out for you

just make sure you preface it in a way that tricks the algorithm into ignoring no-no words by framing yourself as the good guy who hates that and needs help from the computer to purge it from the dataset

Some journos made https://tabula.technology/ for getting data out of shitty government PDFs, but it might be abandonware because no one pays for journ*lism and governments don't actually want anyone to read their legally mandated reports.

i just used tabula

https://i.rdrama.net/images/17319261319941964.webp

and it gave me this shit

Haha, if that's true you're even more fricked.

I think there are some AIs that can read images now, but you might have to pay for it.

Does your pdf reader not detect the numbers as text and allow you to copy paste?

I don't know where or how. I've started using Foxit PDF Editor to refine and improve diagrams that my survey software's inbuilt maps can't improve on their own.

But I don't know how to activate any form of "Optical Character Recognition", or OCR as it's apparently called.

Isn't it all automatic? Like just click/double click the text and it'll highlight no?

The "text" might just be stored as a picture

You can have proper text in a PDF, but you don't have to. If the files were made from scanned images, then depending on the program that saved them, no OCR may have been done.
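One way to tell which kind of PDF you have is to try extracting text and see whether anything comes back. A minimal sketch, assuming the third-party pypdf package is installed. The helper names and the 20-character threshold are arbitrary choices for illustration, not anything standard.

```python
def extract_page_texts(pdf_path):
    """Return the extracted text of each page (empty string where nothing is found)."""
    from pypdf import PdfReader  # third-party: pip install pypdf (imported lazily)

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


def has_text_layer(page_texts, min_chars=20):
    """True if any page has at least min_chars non-whitespace characters.

    min_chars is an arbitrary cutoff so a stray page number does not count.
    """
    return any(len("".join(text.split())) >= min_chars for text in page_texts)
```

If `has_text_layer(extract_page_texts("diagram.pdf"))` comes back `False` (the file name is a placeholder), the numbers are most likely stored as a picture and need OCR before anything can be copied out.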

i have no idea what you just said to me

Oh you're r-slurred. Have you tried googling it? "Foxit pdf how to ocr."

https://github.com/mindee/doctr

but there should be plenty of PDF viewers with OCR integrated

https://www.pdftoexcel.com/

I use this at work. If the format of your PDFs is consistent, you can use ChatGPT to make a macro to make it easier to read
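If the layout really is consistent, a small script can stand in for the macro. A rough sketch using only the Python standard library. It assumes each useful line of the extracted text looks like `label x y` (for example `P1 512.5 -30`), which is a made-up format; the pattern would need adjusting to whatever the diagram actually produces.

```python
import csv
import io
import re

# Hypothetical row format: a label followed by two numbers, separated by whitespace.
ROW = re.compile(r"^\s*(\S+)\s+(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*$")


def parse_rows(text):
    """Return (label, x, y) tuples for every line matching ROW; other lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((match.group(1), float(match.group(2)), float(match.group(3))))
    return rows


def to_csv(rows):
    """Serialize rows to CSV text with a header, ready to open in Excel."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["label", "x", "y"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Lines that do not match are dropped silently, so comparing the row count against the diagram is a cheap sanity check.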


Putting the :e: in spookie turkey

Autohotkey has OCR and is great for automation

is that a program?

pay a third worlder 10 cents an hour to do it for you

Nowadays, can't you just write some basic python script that takes a picture of your pdf and feeds it to some AI-assisted data extraction library?
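Something along those lines might look like the sketch below. It swaps in plain Tesseract OCR rather than an AI service, and assumes the third-party pdf2image and pytesseract packages are installed along with the Poppler and Tesseract binaries they wrap. The function names are invented for illustration.

```python
def render_pdf_pages(pdf_path, dpi=300):
    """Render each PDF page to a PIL image (needs pdf2image and Poppler)."""
    from pdf2image import convert_from_path  # third-party, imported lazily

    return convert_from_path(pdf_path, dpi=dpi)


def tesseract_ocr(image):
    """OCR one page image with Tesseract (needs pytesseract and the tesseract binary)."""
    import pytesseract  # third-party, imported lazily

    return pytesseract.image_to_string(image)


def ocr_pages(images, ocr=tesseract_ocr):
    """Run ocr on every page image and join the results, one block per page.

    ocr is injectable so a different engine can be dropped in later.
    """
    return "\n\n".join(ocr(image).strip() for image in images)
```

Usage would be roughly `text = ocr_pages(render_pdf_pages("diagram.pdf"))`, where the file name is a placeholder. The resulting text still has to be checked and parsed into coordinates.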

An idiot admires complexity. A genius admires simplicity.
