#4. Segment anything, publication backlog, ChatGPT as a lawyer
New publications, some thoughts on Meta AI's SAM and other recent "AI Hype" news
Hello,
Back in March I had the pleasure of giving a lecture on large language models and ChatGPT to law students. It was fun. Unfortunately it wasn’t possible to record in the auditorium we were in, but I’ve put the slides and notes (in French) online for anyone interested [PDF]. In other “AI hype” related news, the call for an “AI pause” was a bit absurd, but I didn’t write much about it because Timnit Gebru et al. have done a much better job of criticizing it than I could [dair-institute.org]. Another story I started to write about before seeing someone do the job better is that of the (probably soon ex-)lawyers who thought it would be a good idea to have ChatGPT write a motion for them. It didn’t go well, and Devin Stone (aka Legal Eagle) has a very good video on the story on his YouTube channel [Youtube/@LegalEagle]. It would have been nice for this story to unfold before I gave my course to the law students, but at least it demonstrates that this kind of lesson may be useful to some…
SAM
Meta AI has released the “Segment Anything Model”, aka SAM, with a paper where they actually explain their methodology and dataset [arxiv:2304.02643] (see OpenAI, it’s not that hard!), a live demo on their website, and open-source code on GitHub. While I’d rather not have Meta set the standard for computer vision, at least they seem to be doing some proper science.
The results are really impressive, and some preliminary tests on medical images [arxiv:2304.05396, arxiv:2304.04155] show results which, while not really state-of-the-art and not fully automated, are still quite good for a system that was not trained on medical images at all.
As always with these very large models, it’s difficult to determine whether the good results are a product of the “technology” (i.e. the vision transformers) or of the huge dataset & resources. When I read “We distribute training across 256 GPUs, due to the large image encoder and 1024×1024 input size,” it’s hard to make any relevant comparison to results I could obtain with my setup…
To summarize: potentially interesting results, though I strongly doubt that this is a good direction for medical imaging; but at least Meta AI is being more responsible in its research than Microsoft and OpenAI.
On the research blog
Sometimes I write about “general interest” topics, such as ChatGPT. Sometimes the content is a bit more niche. That’s certainly the case for this one: “Misusing Hausdorff’s Distance”, where I look at how scikit-image’s implementation of this segmentation metric is sometimes misunderstood, leading to incorrect results.
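The post goes into the specifics, but to illustrate the general kind of pitfall (not necessarily the exact one discussed there): scikit-image’s `hausdorff_distance` computes the distance between the *nonzero pixels* of the two images, so feeding it filled masks instead of contours can give a very different, and possibly misleading, result. A minimal sketch in pure NumPy — the `hausdorff` and `boundary` helpers below are mine, mimicking the point-set convention:

```python
import numpy as np

def hausdorff(mask_a, mask_b):
    """Symmetric Hausdorff distance between the nonzero pixels of two
    masks (same convention as skimage.metrics.hausdorff_distance)."""
    pts_a = np.argwhere(mask_a)
    pts_b = np.argwhere(mask_b)
    # Pairwise Euclidean distances between the two point sets.
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def boundary(mask):
    """Foreground pixels with at least one 4-connected background neighbour."""
    padded = np.pad(mask, 1)
    neigh_bg = (~padded[:-2, 1:-1] | ~padded[2:, 1:-1]
                | ~padded[1:-1, :-2] | ~padded[1:-1, 2:])
    return mask & neigh_bg

# Ground truth: a solid 21x21 square. Prediction: the same square with a
# one-pixel hole punched in the middle.
gt = np.zeros((31, 31), dtype=bool)
gt[5:26, 5:26] = True
pred = gt.copy()
pred[15, 15] = False

print(hausdorff(gt, pred))                      # 1.0 on filled masks
print(hausdorff(boundary(gt), boundary(pred)))  # 9.0 on contours
```

On filled masks the hole is nearly invisible (every missing pixel has a foreground neighbour one pixel away), while on contours the hole’s boundary sits far from the outline — two defensible-looking calls, an order of magnitude apart.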
Publication
Finally, the last two papers from my PhD work have made it through the sometimes excruciatingly slow scientific publishing process.
The first one was submitted in December 2021 and accepted in April 2023. I honestly don’t understand why it took that long: the end result is very close to the original draft I submitted, so the added value of the peer-review process is limited at best in this case. But here we are: “Evaluating participating methods in image analysis challenges: lessons from MoNuSAC 2020” has been accepted in Pattern Recognition and is now available online [doi:10.1016/j.patcog.2023.109600]. I also briefly explain what’s in it on the research blog.
The second one went a lot faster, and is available in Scientific Reports [doi:10.1038/s41598-023-35605-7]: “Panoptic quality should be avoided as a metric for assessing cell nuclei segmentation and classification in digital pathology”. Where the Pattern Recognition paper argues that complex metrics for complex tasks should generally be avoided in favour of simple “disentangled” metrics per sub-task, the Scientific Reports paper is more narrowly focused: it explains why Panoptic Quality is a particularly poor choice for nuclei instance segmentation and classification.
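For readers who don’t have the metric in mind: Panoptic Quality (from Kirillov et al.’s panoptic segmentation paper) sums the IoUs of matched instances and divides by TP + ½FP + ½FN. A toy example (my own, not taken from the paper) hints at why a single aggregated number can be a problem: two very different failure modes — a slightly sloppy segmentation of every nucleus, and a perfect segmentation that misses half the nuclei — end up with exactly the same score.

```python
import numpy as np

def panoptic_quality(gt_instances, pred_instances):
    """Panoptic Quality for a single class: sum of matched IoUs over
    TP + 0.5*FP + 0.5*FN. Pairs match when IoU > 0.5, which makes the
    matching unambiguous."""
    matched_ious, matched_pred = [], set()
    for g in gt_instances:
        for j, p in enumerate(pred_instances):
            if j in matched_pred:
                continue
            iou = np.logical_and(g, p).sum() / np.logical_or(g, p).sum()
            if iou > 0.5:
                matched_ious.append(iou)
                matched_pred.add(j)
                break
    tp = len(matched_ious)
    fp = len(pred_instances) - len(matched_pred)
    fn = len(gt_instances) - tp
    return sum(matched_ious) / (tp + 0.5 * fp + 0.5 * fn)

def square(y, x, size=5, shape=(20, 20)):
    """A size-by-size 'nucleus' mask with its top-left corner at (y, x)."""
    m = np.zeros(shape, dtype=bool)
    m[y:y + size, x:x + size] = True
    return m

gt = [square(5, 5), square(12, 5)]
# Case A: both nuclei detected, each shifted by one pixel (IoU = 2/3 each).
pred_a = [square(5, 6), square(12, 6)]
# Case B: one nucleus segmented perfectly, the other missed entirely.
pred_b = [square(5, 5)]
print(panoptic_quality(gt, pred_a))  # 0.666…
print(panoptic_quality(gt, pred_b))  # 0.666… — same PQ, very different errors
```

Reporting a detection score and a segmentation score separately would distinguish the two cases immediately; PQ, by construction, cannot.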
…and this concludes this newsletter.
Until next time, have a nice day!
Adrien.