Georgia Tech researchers recently presented their work at leading programming and systems conferences, focusing on static ...
Abstract: The emergence of Large Language Models (LLMs) has driven significant advancements in Natural Language Processing (NLP) and introduced new text-related applications, such as Visual Question ...
Abstract: Structure-guided image completion aims to inpaint a local region of an image according to an input guidance map from users. While such a task enables many practical applications for ...
UniPixel is a unified MLLM for pixel-level vision-language understanding. It flexibly supports a variety of fine-grained tasks, including image/video segmentation, regional understanding, and a novel ...