How is Biology Research handling new Tech?


								

								

>Submit paper on ML algorithm for spatial transcriptomics :marseysnappy:

>One reviewer insists we benchmark it against his two year old methodology with like 2 citations that no one knows or cares about :marseysmug2:

>Since scientists can't make repeatable or documented code the entire thing is hard coded to his one training dataset meaning I have to rewrite it :marseygigaretard:

>Rather than provide a Guix archive or pack or a Docker image, they just have a list of pip packages :marseyboardcode:

>they are all so outdated they refuse to install the right version :marseytabletired:

:marseyeyelidpulling: I ofc have to finish this in 30 days while I also have two class projects due in 20 days plus studying for my finals. I love having to do the same work as PhD and postgrad students for less pay :marseywagie:

TLDR; !linuxchads !fosstards !biology Despite having cowtools to create instantly deployable, exact dependency environments via Docker or Guix, scientists can apparently pass unreplicable and hardcoded code past reviewers. :marseythumbsup:

Pip3 download speeds are prob butt (like 600 kb/s) because there are one gorillion AWS and Azure VMs for ML trying to set up at the same time, and Amazon and Microsoft can't be arsed to use their billions of dollars to make their own mirrors, so they just leech off FOSS projects.


As much as I like to shit on sexy Indian dudes and webshit devs, reading code from researchers is next level heck. Kinda messed up that he's asking you to pad his own work, I don't know much about submitting research papers, is that common?


:#marseyviewerstaretalking:


This is a chink researcher ofc. And yeah this is pretty common in the tech side of biology. Since its "easy" reviewers will demand you benchmark every tool against every other tool in existence ignoring how temperamental and poorly documented all these POS are.


One reviewer insists we benchmark it against his two year old methodology

They didn't know a lot about what you were talking about and this was a way to make them seem smart.

It's the nerd version of hazing.


Since scientists can't make repeatable or documented code the entire thing is hard coded to his one training dataset meaning I have to rewrite it :marseygigaretard:

the fricking absolute state of academia


This guy is holding my paper hostage, presumably so he can eke one more citation out of this POS.


You should try (and fail) to replicate his work and publish that, spite is good


What do you mean? It's all right there... In the excel workbook

:#marseygigachadtalking:


I know a couple of CS professors. At their school every single person involved went straight into academia when they finished their own education. One of them didn't finish his PhD until he was already teaching. None of them have a single lick of real world experience; they don't know what's actually useful or being used these days. Anything slightly newish like Docker is not only foreign and unknown to them but totally uninteresting to them. Lack of curiosity defines their computer use. They don't really seem to know jack shit about Linux, and 90% of their work appears to be done in Visual Basic via a VM from a MacBook. I can't imagine the situation where computers meet applied sciences can be good.


I do all my work on a MacBook ssh'd into a Linux enterprise machine and code everything in nano (no syntax highlighting) :marseybigbrain:. Comp sci cels can't even provide the version of Python they used at minimum, so I have to guess what exact version works with the unearthly dependency chain.
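
For anyone on the publishing side of this, recording the interpreter and package versions takes a few lines of standard-library Python. A minimal sketch: the function name `environment_snapshot` and the JSON layout are made up for illustration, not taken from anyone's actual repo.

```python
import json
import platform
import sys
from importlib import metadata


def environment_snapshot():
    """Collect the interpreter version and the version of every installed distribution."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that carry no name in their metadata
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    # Dump the snapshot as JSON so it can be committed next to the code.
    json.dump(environment_snapshot(), sys.stdout, indent=2)
```

Redirecting the output to a file in the repo would at least save the next person from guessing the Python version.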


code everything in nano

:marseygigaretard:


code everything in nano

https://media.tenor.com/KjJTBQ9lftsAAAAx/why-huh.webp


Please use vscode remote instead


That's the fricking problem I always had with LISP-y languages. They always came from the fricking ivory towers of academia and lacked so many real world utilities.


This is why I love stata. All of its models and commands are designed by professors (economists) who painstakingly write out the documentation. The problem with python cowtools is that you don't know which method and which underlying assumptions are at play (because the !fosstards who wrote it have no idea either). Must be heck using it for anything involving serious statistics.

!codecels, bring out the pythonistas. I want to wring your necks!

:#marseyfrozenchosenchokespal:


I unironically love Python


The sooner D replaces python the better


frick you


I love Haskell, I think it hits an amazing mix of functional purity and a deeply expressive type system.


assert.isTrue(true)

:marseyme: 100% coverage achieved.


Reported by:
  • DickButtKiss : I dont believe it. Sounds like bullshit. There's something nefarious behind the scenes

Another place had freak out over a possible disappearing crystal contamination. HUGE :marseycracka: drive to triple check everything over the next few days make sure nothing is fricked.

:marseychemist2:

Studying polymorphism is important because many drugs receive regulatory approval for only a particular crystal form of a compound. But polymorphs can disappear - where it becomes difficult to continue growing a particular crystal form. New, more stable, forms are grown. This happened to a HIV drug called ritonavir. Over time, a second crystal form emerged, which was less soluble and so less effective as a drug. Ritonavir had to be pulled from the market while researchers worked to solve the problem.

https://www.chemistryworld.com/news/the-mystery-of-the-disappearing-crystals/3003967.article

!chuds required reading


Science is epic :marseyhappyjump:


Dogshit code is a part of the field. I do enjoy the silly design decisions you encounter when dealing with it.

Pip speeds are crappy because PyPI was never intended for large wheels. Unfortunately the big ML packages shit all over the normal rules and have everyone downloading gigabyte Nvidia silo binaries for no good reason.


I have just never seen anything this hard coded :marseycringe:. Like as a reviewer you'd think question 1 would be "can this code be tested on other data". Like what is the point of publishing an ML model which works on only two datasets :marseyyikes:
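
Making the code testable on other data mostly means taking the dataset path and column names as arguments instead of baking them in. A minimal sketch, assuming a CSV with one row per spot and a label column; the names `load_counts`, `main`, and the default `cell_type` column are hypothetical, not the reviewer's actual layout.

```python
import argparse
import csv
from pathlib import Path


def load_counts(path, label_column):
    """Read a CSV of numeric features plus one label column; return (features, labels)."""
    features, labels = [], []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            labels.append(row.pop(label_column))
            features.append({name: float(value) for name, value in row.items()})
    return features, labels


def parse_args(argv=None):
    """The dataset is a required argument, so nothing is tied to one file."""
    parser = argparse.ArgumentParser(description="Run the benchmark on any dataset.")
    parser.add_argument("dataset", type=Path, help="CSV file, one row per spot")
    parser.add_argument("--label-column", default="cell_type")
    return parser.parse_args(argv)


def main(argv=None):
    """Load whichever dataset was passed in and report its size."""
    args = parse_args(argv)
    features, labels = load_counts(args.dataset, args.label_column)
    return len(features), len(set(labels))
```

Calling `main(["other_lab.csv", "--label-column", "annotation"])` would then run the same code on a second dataset without touching the source.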


It's pretty common. I often see models so overfitted that it's basically a top hat filter. "It works well on my data" is all these researchers understand


have everyone downloading gigabyte Nvidia silo binaries for no good reason.

I have to download this 4gb file to show a 15 second video of a black guy making a joke but it's somehow for "research". :marseyschizotwitch:


Since scientists can't make repeatable or documented code the entire thing is hard coded to his one training dataset meaning I have to rewrite it

Just dont lmao. The editor will understand if you explain it to them. Which to be fair is only true if the editor is not r-slurred.

And unless you are going for a top tier journal (which you shouldnt because effort risk funding connections etc) submit it to multiple journals at the same time. If an editor finds out and calls you out on it just say it was an accident and apologize.


Sorry, ML "algorithm" for spatial transcriptomics? What is it really?


No idea, it's what the PhD students are doing. I just touch the benchmarking of other algorithms for the paper :marseycry:. From what I understand it's an ML algorithm to find the boundaries of cell types in spatial transcriptomics data in cancer patients. As of rn cancer research is on fire with cell-to-cell communications research, of which spatial analysis is a large part.


It'd be absolutely wild if you knew this guy I used to know.


You seem to be quite the GUIX fanboy as of late. Why should I use it over Nix?


No particular reason other than it has a bunch of biology based packages and I personally don't like Nix's custom scripting language.


I mean truth be told it kind of does make more sense to use a lisp dialect for scripting in general since it's the only lang off the top of my head where knowing one means you know them all, etc etc


i only worked with bioinformatics once and they had done their work in julia. That was ~10 years ago, I hate them so much.


Julia was basically killed when Python got faster. The only positive of Julia is that it can work with GPUs without the mess that is CUDA Nvidia slop.


is..is that....dare i say, S-S-S-SOVL?????????????????

uhhh ughhh UGHHHH SSOUVL!!!!

aaaaaAAAAAaaaAAAAAAAAAAAAaAAAAaaaa SSSSOOOOVLLLLLLLL SOVL SOVL SOVL SOVL!!!!!!!!!!!!!!!

ohh! OHHH OHHH GOD Im SOVVVLLINGGG!!!!!!

SOVL!

OU

L!

SOVL!

OU

L!

SOVL!

OU

L!


they just have a list of pip packages

Most research scientists aren't even enlightened enough to do this!

Pip3 download speeds are prob butt (like 600 kb/s)

If you can keep your dependencies managed entirely by Pip (and don't have to shoot yourself by touching Conda), definitely use one of the dependency managers with a lockfile like PDM (best) or Poetry. Installs go like 5-10x faster because you're not fighting dependency resolution at every step, and you have a reference of every version of every dependency in the lockfile. Also works really well if you're building a Docker image.
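
Even with only a pinned requirements list and no lockfile tool, you can check whether the environment you ended up with matches the pins. A rough sketch using only the standard library; `parse_pins` and `find_mismatches` are hypothetical helpers, and the parser only understands plain `name==version` lines (no extras, hashes, or URLs).

```python
from importlib import metadata


def parse_pins(lines):
    """Return {name: version} for each 'name==version' line, skipping comments and blanks."""
    pins = {}
    for raw in lines:
        # Drop trailing comments and environment markers before parsing.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins):
    """Compare pins against the current environment; return {name: (wanted, found)}."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # pinned but not installed at all
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Feeding it the lines of a `requirements.txt` and getting back an empty dict would mean every pinned package is installed at the pinned version.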

But yeah you can't truly hate Python until you read researcher code. I'm convinced many of them don't even understand data types; total dynamic typing death please.


If you want a package manager try uv. It came out recently and blows everything else out of the water for performance and ease of use


Oh nice, will have to look at uv. They made ruff too, which also blows all of the other linters/formatters out of the water in both performance and usability. The Python Packaging Authority should kill themselves for being outdone by another group of randos for the THIRD time.


All ML/AI in biology is useless and so is transcriptomics.


wait i thought ai solved biotech already, why are u still doing "research" :marseyconfused:
