Mariia (Masha) Baidachna
- Undergraduate Student
University of Glasgow
Attended ICCV (International Conference on Computer Vision)
2025, Honolulu, Hawaii, United States
Attending ICCV was an amazing experience overall! The amount of information transferred in such a short time span was incredible. Walking around a hall with over 400 posters, which rotated every few hours, made me feel like a sponge, constantly absorbing new ideas from the latest publications. Presenting my own poster was also a really cool experience. It was rewarding to connect with people whose work I admire and to have them come up and listen to my research in return. The community that ICCV brings together is definitely one of the most up-to-date when it comes to discussing state-of-the-art research. And of course, visiting Hawaii was a privilege in itself. It was a very productive escape from the cold, dark, and rainy Scottish autumn.
Here are some concrete takeaways and open-ended questions I collected from talks and posters (my focus was visual reasoning and autonomous driving):
– Should agents for GUI-related tasks rely on perception, code, or underlying text/captions?
– When we use a novel method that improves a certain capability of an LLM (for example, reasoning for visual tasks), how can we be sure the model still performs the same on other tasks and doesn’t actually degrade in performance overall?
– How far can we generalise reasoning? CoT to CoS? Algebraic to geometric?
– ViT takes image patches as input, not objects. This may contribute to the counting problem, where LLMs find it difficult to count objects in an image. Is the patch-based representation scale invariant?
– The majority of academics are working with frozen pretrained weights and avoiding training as much as possible.
– Synthetic aperture radar has a lot in common with the imaging of black holes by the EHT (Event Horizon Telescope).
– Some form of smoothing between datapoints can be used to generate videos.
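The ViT point above can be made concrete with a toy example. This is a minimal sketch, assuming NumPy and a single-channel image; the helpers `patchify` and `patches_touched` are hypothetical names invented for illustration and do not come from any ViT library or from a paper presented at ICCV. It shows only that the same object can land in one patch or be split across several, depending on where it sits relative to the patch grid.

```python
import numpy as np


def patchify(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W) image into non-overlapping patch x patch tiles.

    Returns an array of shape (num_patches, patch * patch), i.e. one
    flattened row per tile, similar in spirit to ViT input tokens.
    """
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image must divide evenly"
    tiles = image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)
    return tiles.reshape(-1, patch * patch)


def patches_touched(image: np.ndarray, patch: int) -> int:
    """Count how many patches contain at least one non-zero pixel."""
    return int((patchify(image, patch).sum(axis=1) > 0).sum())


# Toy 8x8 images, each containing a single 4x4 "object" of ones.
aligned = np.zeros((8, 8))
aligned[0:4, 0:4] = 1  # object sits exactly on the patch grid

shifted = np.zeros((8, 8))
shifted[2:6, 2:6] = 1  # same object, moved by two pixels

print(patches_touched(aligned, 4))  # object falls inside a single patch
print(patches_touched(shifted, 4))  # object is split across several patches
```

One object is still one object, but the number of tokens that carry a piece of it changes with position and with the patch size, which is one plausible reason counting from patch tokens is hard.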
