Students who want to write a bachelor thesis should have attended at least one of Prof. Zell's lectures and passed it with a good or at least satisfactory grade. Alternatively, they may have obtained the relevant background knowledge from other, similar lectures.
Mentor: Andreas Ziegler
Email: andreas.ziegler@uni-tuebingen.de
Intelligent interaction with the physical world requires perceptual abilities beyond vision and hearing; vibrant tactile sensing is essential for autonomous robots to dexterously manipulate unfamiliar objects or safely contact humans. Therefore, robotic manipulators need high-resolution touch sensors that are compact, robust, inexpensive, and efficient. In recent work, our collaborators at MPI presented Minsight [1], a soft vision-based haptic sensor, which is a miniaturized and optimized version of the previously published sensor Insight. Minsight has the size and shape of a human fingertip and uses machine learning methods to output high-resolution maps of 3D contact force vectors at 60 Hz.
To capture the high-frequency content of textures, however, an update rate of 60 Hz is not enough. Event-based cameras [2], which are becoming increasingly popular, could be a good alternative to the classical frame-based camera used so far. Event cameras are bio-inspired sensors that asynchronously report timestamped changes in pixel intensity and offer advantages over conventional frame-based cameras in terms of low-latency, low-redundancy sensing and high dynamic range. Hence, event cameras have a large potential for robotics and computer vision.
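Since Deep Learning pipelines such as Minsight's typically consume dense tensors, a common first step is to aggregate the asynchronous event stream into a dense representation. The following minimal Python sketch (function name and array layout are our own choices, not taken from [1] or [2]) accumulates events into a voxel grid with a few temporal slices so it can be fed to a conventional CNN:

    import numpy as np

    def events_to_voxel_grid(x, y, t, p, bins, shape):
        """Accumulate an event stream into `bins` temporal slices.

        x, y: integer pixel coordinates per event
        t:    timestamps, sorted ascending
        p:    polarities (sign of the brightness change)
        """
        grid = np.zeros((bins, *shape), dtype=np.float32)
        t_norm = (t - t[0]) / max(t[-1] - t[0], 1e-9)           # normalize to [0, 1]
        b = np.clip((t_norm * bins).astype(int), 0, bins - 1)   # temporal slice index
        np.add.at(grid, (b, y, x), np.where(p > 0, 1.0, -1.0))  # signed event counts
        return grid

Many alternative representations exist (event frames, time surfaces, learned embeddings); choosing one that preserves the high temporal resolution would be one of the design decisions in this thesis.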
In this thesis, the student is tasked with using a new, miniature event-based camera together with Deep Learning to bring Minsight to the next level.
The student should be familiar with Computer Vision and Deep Learning, and should ideally have already used Deep Learning frameworks such as PyTorch in previous projects or coursework.
[1] I. Andrussow, H. Sun, K. J. Kuchenbecker, and G. Martius, "Minsight: A Fingertip-Sized Vision-Based Tactile Sensor for Robotic Manipulation," Advanced Intelligent Systems, vol. 5, no. 8, p. 2300042, Aug. 2023.
[2] G. Gallego, T. Delbruck, G. Orchard, C. Bartolozzi, B. Taba, A. Censi, S. Leutenegger, A. Davison, J. Conradt, K. Daniilidis, D. Scaramuzza, Event-based Vision: A Survey, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 44, no. 1, pp. 154-180, 1 Jan. 2022.
Mentor: Andreas Ziegler
Email: andreas.ziegler@uni-tuebingen.de
Event cameras are bio-inspired sensors that asynchronously report timestamped changes in pixel intensity and offer advantages over conventional frame-based cameras in terms of low-latency, low-redundancy sensing and high dynamic range. Hence, event cameras have a large potential for robotics and computer vision.
Currently, a practical obstacle to the adoption of event camera technology is the high cost of several thousand dollars per camera, similar to the situation with early time-of-flight cameras. In a recent project [1], we developed an event simulator that takes frames from a conventional frame-based camera as input and outputs events in real time.
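As a rough illustration of the core idea behind frame-based event simulation (the actual simulator in [1] handles timestamp interpolation, per-pixel state, and noise far more carefully), one can threshold log-intensity differences between consecutive frames and emit one event per crossed contrast step:

    import numpy as np

    def frames_to_events(frame_prev, frame_next, t_prev, t_next, c=0.2):
        """Emit (t, x, y, polarity) tuples for every contrast step c crossed
        in log-intensity between two consecutive grayscale frames."""
        eps = 1e-3                                    # avoid log(0)
        dlog = np.log(frame_next + eps) - np.log(frame_prev + eps)
        n = (np.abs(dlog) / c).astype(int)            # threshold crossings per pixel
        events = []
        for yy, xx in zip(*np.nonzero(n)):
            pol = 1 if dlog[yy, xx] > 0 else -1
            for k in range(n[yy, xx]):                # spread events over the interval
                ts = t_prev + (k + 1) / (n[yy, xx] + 1) * (t_next - t_prev)
                events.append((ts, xx, yy, pol))
        return sorted(events)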
The goal of this thesis is to evaluate the limits of this event simulator by applying it in different real-time use cases. Two such scenarios are real-time tracking of a fast-moving object and balancing a ball on a 2D plane with a robot arm.
The student should be familiar with "traditional" Computer Vision and Robotics. A good command of C++ or Python from previous projects would be beneficial.
[1] A. Ziegler, D. Teigland, J. Tebbe, T. Gossard, and A. Zell, “Real-time event simulation with frame-based cameras.” arXiv, Sep. 10, 2022. Accessed: Dec. 09, 2022. [Online]. Available: arxiv.org/abs/2209.04634
Mentor: Andreas Ziegler
Email: andreas.ziegler@uni-tuebingen.de
Event cameras are bio-inspired sensors that asynchronously report timestamped changes in pixel intensity and offer advantages over conventional frame-based cameras in terms of low-latency, low-redundancy sensing and high dynamic range. Hence, event cameras have a large potential for robotics and computer vision.
State-of-the-art machine-learning methods for event cameras treat events as dense representations and process them with CNNs. Thus, they fail to maintain the sparsity and asynchronous nature of event data, thereby imposing significant computation and latency constraints. A recent line of work [1]–[5] tackles this issue by modeling events as spatio-temporally evolving graphs that can be efficiently and asynchronously processed using graph neural networks. These works showed impressive reductions in computation.
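To make the graph construction concrete, the sketch below (plain PyTorch, our own simplification) connects events that are close in space-time, with time rescaled to pixel units as is common in these works; production implementations such as [4], [5] insert events into the graph incrementally and use spatial data structures instead of the O(N²) distance matrix used here:

    import torch

    def event_graph(x, y, t, radius=3.0, beta=1e4):
        """Build a spatio-temporal radius graph: nodes are events, edges
        connect events within `radius` in (x, y, beta*t) space."""
        pos = torch.stack([x, y, beta * t], dim=1)     # (N, 3) node positions
        dist = torch.cdist(pos, pos)                   # pairwise distances, O(N^2)
        adj = (dist < radius) & ~torch.eye(len(x), dtype=torch.bool)
        edge_index = adj.nonzero().t()                 # (2, E) COO edge list
        return pos, edge_index

The resulting pos/edge_index pair is the standard input format for graph neural network libraries such as PyTorch Geometric.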
The goal of this thesis is to apply these graph-based networks to ball detection with event cameras. Existing graph-based networks were designed for more general object detection tasks [4], [5]. Since we only want to detect balls, in a first step the student will investigate whether a network architecture tailored to our use case can further improve inference time.
The student should be familiar with "traditional" Computer Vision and Deep Learning. Experience with Python and PyTorch from previous projects would be beneficial.
[1] Y. Li et al., “Graph-based Asynchronous Event Processing for Rapid Object Recognition,” in 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, Oct. 2021, pp. 914–923. doi: 10.1109/ICCV48922.2021.00097.
[2] Y. Deng, H. Chen, H. Liu, and Y. Li, “A Voxel Graph CNN for Object Classification with Event Cameras,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, Jun. 2022, pp. 1162–1171. doi: 10.1109/CVPR52688.2022.00124.
[3] A. Mitrokhin, Z. Hua, C. Fermuller, and Y. Aloimonos, “Learning Visual Motion Segmentation Using Event Surfaces,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, Jun. 2020, pp. 14402–14411. doi: 10.1109/CVPR42600.2020.01442.
[4] S. Schaefer, D. Gehrig, and D. Scaramuzza, “AEGNN: Asynchronous Event-based Graph Neural Networks,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, Jun. 2022, pp. 12361–12371. doi: 10.1109/CVPR52688.2022.01205.
[5] D. Gehrig and D. Scaramuzza, “Pushing the Limits of Asynchronous Graph-based Object Detection with Event Cameras.” arXiv, Nov. 22, 2022. Accessed: Dec. 16, 2022. [Online]. Available: arxiv.org/abs/2211.12324
Mentor: Andreas Ziegler
Email: andreas.ziegler@uni-tuebingen.de
Event cameras are bio-inspired sensors that asynchronously report timestamped changes in pixel intensity and offer advantages over conventional frame-based cameras in terms of low-latency, low-redundancy sensing and high dynamic range. Hence, event cameras have a large potential for robotics and computer vision.
In comparison to image frames from conventional cameras, data from event-based cameras is much sparser in most cases. A deep-learning-based detector that takes this sparsity into account can achieve a reduced inference time. The goal of this thesis is to use Asynchronous Sparse Convolutional Layers [1] in a neural network to detect fast-moving table tennis balls in real time.
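The sketch below emulates the key invariant of a submanifold-style sparse convolution, the building block that the asynchronous layers of [1] accelerate: outputs are kept only at active input sites, so the activation pattern (and hence the sparsity) is preserved from layer to layer. A real sparse implementation would gather only the active neighbourhoods instead of computing the dense convolution as done here:

    import torch
    import torch.nn.functional as F

    def submanifold_conv2d(x, weight, bias, active):
        """Convolution whose outputs are restricted to active input sites,
        so the activation pattern does not dilate across layers."""
        y = F.conv2d(x, weight, bias, padding=weight.shape[-1] // 2)
        return y * active                              # zero out inactive sites

    # Hypothetical usage on an almost-empty event-count frame:
    x = torch.zeros(1, 1, 64, 64)
    x[0, 0, 10, 20] = 1.0                              # a single event
    active = (x != 0).float()
    y = submanifold_conv2d(x, torch.randn(8, 1, 3, 3), torch.zeros(8), active)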
The student should be familiar with "traditional" Computer Vision, Machine Learning/Deep Learning, and Python. Prior experience with PyTorch would be beneficial.
[1] N. Messikommer, D. Gehrig, A. Loquercio, and D. Scaramuzza, “Event-based Asynchronous Sparse Convolutional Networks,” European Conference on Computer Vision. (ECCV) 2020. [Online]. Available: arxiv.org/abs/2003.09148
Mentor: Andreas Ziegler
Email: andreas.ziegler@uni-tuebingen.de
Event cameras are bio-inspired sensors that asynchronously report timestamped changes in pixel intensity and offer advantages over conventional frame-based cameras in terms of low-latency, low-redundancy sensing and high dynamic range. Hence, event cameras have a large potential for robotics and computer vision.
Since event cameras report per-pixel changes of intensity, their output resembles an image gradient in which mainly edges and corners are present. The contrast maximization framework (CMax) [1] exploits this fact by optimizing the sharpness of accumulated events to solve computer vision tasks such as the estimation of motion, depth, or optical flow. Most recent works on event-based (multi-)object segmentation [2]–[4] apply this CMax framework. The common scheme is to jointly assign events to an object and fit a motion model that best explains the data.
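The following minimal sketch illustrates CMax for a single object under a constant 2D image velocity (our own simplification; the works above use more expressive motion models and robust focus objectives): events are warped to a reference time, accumulated into an image of warped events (IWE), and the velocity that maximizes the IWE's variance, i.e. its sharpness, is sought:

    import numpy as np
    from scipy.optimize import minimize

    def neg_iwe_variance(theta, x, y, t, shape):
        """Negative variance of the image of warped events (IWE) for a
        constant-velocity motion model theta = (vx, vy)."""
        vx, vy = theta
        xw = np.clip(np.round(x - vx * (t - t[0])).astype(int), 0, shape[1] - 1)
        yw = np.clip(np.round(y - vy * (t - t[0])).astype(int), 0, shape[0] - 1)
        iwe = np.zeros(shape)
        np.add.at(iwe, (yw, xw), 1.0)                  # accumulate warped events
        return -iwe.var()                              # sharper IWE -> lower value

    # Hypothetical usage with per-event arrays x, y, t from the camera:
    # res = minimize(neg_iwe_variance, x0=np.zeros(2),
    #                args=(x, y, t, (480, 640)), method="Nelder-Mead")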
The goal of this thesis is to develop a real-time-capable (multi-)object tracking pipeline based on multi-object segmentation. After becoming familiar with the recent literature, the student should choose a suitable multi-object segmentation approach and adapt it to our use case, a table tennis setup. Afterwards, different object tracking approaches should be developed, evaluated, and compared against each other.
The student should be familiar with "traditional" Computer Vision. Experience with C++ and/or optimization from previous projects or coursework would be beneficial.
[1] G. Gallego, M. Gehrig, and D. Scaramuzza, “Focus Is All You Need: Loss Functions for Event-Based Vision,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, Jun. 2019, pp. 12272–12281. doi: 10.1109/CVPR.2019.01256.
[2] X. Lu, Y. Zhou, and S. Shen, “Event-based Motion Segmentation by Cascaded Two-Level Multi-Model Fitting.” arXiv, Nov. 05, 2021. Accessed: Jan. 05, 2023. [Online]. Available: http://arxiv.org/abs/2111.03483
[3] T. Stoffregen, G. Gallego, T. Drummond, L. Kleeman, and D. Scaramuzza, "Event-Based Motion Segmentation by Motion Compensation," arXiv:1904.01293 [cs], Aug. 2019. Accessed: Jun. 14, 2021. [Online]. Available: http://arxiv.org/abs/1904.01293
[4] Y. Zhou, G. Gallego, X. Lu, S. Liu, and S. Shen, “Event-based Motion Segmentation with Spatio-Temporal Graph Cuts,” IEEE Trans. Neural Netw. Learn. Syst., pp. 1–13, 2021, doi: 10.1109/TNNLS.2021.3124580.
Mentor: Andreas Ziegler
Email: andreas.ziegler@uni-tuebingen.de
Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location, and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of μs), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz), resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision.
So far, most learning approaches applied to event data convert a batch of events into a tensor and then process it with conventional CNNs. While such approaches achieve state-of-the-art performance, they do not exploit the asynchronous nature of the event data. Spiking Neural Networks (SNNs), on the other hand, are bio-inspired networks that can process the output of event-based cameras directly. SNNs process information conveyed as temporal spikes rather than numeric values, which makes them an ideal counterpart for event-based cameras.
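As a minimal illustration of the spiking computation involved (a generic leaky integrate-and-fire layer in NumPy, not a specific architecture from the literature): membrane potentials leak over time, integrate weighted input spikes, and emit an output spike with a reset whenever they cross a threshold:

    import numpy as np

    def lif_layer(spikes_in, w, tau=20.0, v_th=1.0, dt=1.0):
        """Leaky integrate-and-fire layer: spikes_in has shape (T, n_in),
        w has shape (n_in, n_out); returns binary output spikes (T, n_out)."""
        v = np.zeros(w.shape[1])                       # membrane potentials
        spikes_out = np.zeros((spikes_in.shape[0], w.shape[1]))
        decay = np.exp(-dt / tau)                      # exponential leak per step
        for step in range(spikes_in.shape[0]):
            v = decay * v + spikes_in[step] @ w        # leak + integrate input
            fired = v >= v_th
            spikes_out[step] = fired                   # emit spikes
            v[fired] = 0.0                             # reset fired neurons
        return spikes_out

In practice, PyTorch-based SNN frameworks such as snnTorch or Norse provide trainable versions of such neuron models.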
The goal of this thesis is to investigate and evaluate how an SNN can be used together with our event-based cameras to detect and track table tennis balls. The Cognitive Systems group has a table tennis robot system in which the developed ball tracker can be deployed and compared to other methods.
Requirements: familiarity with "traditional" Computer Vision, Deep Learning, and Python.