Yonghong Tian is currently a professor with the National Engineering Laboratory for Video Technology, School of Electronics Engineering and Computer Science, Peking University, Beijing, China. He received the Ph.D. degree from the Institute of Computing Technology, Chinese Academy of Sciences, China, in 2005. His research interests include machine learning, computer vision, and multimedia analysis and coding. He is the author or coauthor of over 100 technical articles in refereed journals and conferences. Dr. Tian is currently a Young Associate Editor of FRONTIERS OF COMPUTER SCIENCE IN CHINA, the guest editor of the special issue on “Social Multimedia Computing: Challenges, Techniques, and Applications” of the Journal of Multimedia, a member of the IEEE TCMC-TCSEM Joint Executive Committee in Asia (JECA), and the program co-chair of the first IEEE International Conference on Multimedia Big Data (BigMM 2015). He was a recipient of the Second Prize of the National Science and Technology Progress Awards in 2010; the best performer in the TRECVID content-based copy detection task from 2010 to 2011; the top performer in the TRECVID retrospective surveillance event detection task from 2009 to 2012; and the winner of the WikipediaMM task in ImageCLEF 2008. He is a senior member of IEEE and a member of ACM.
For various video-related applications, high-efficiency scalable video coding and foreground visual object representation are two of the most important enabling technologies. On one hand, with the exponentially increasing use of video teleconferencing and real-time traffic monitoring, high-bit-rate video streams often need to be coded or transcoded, simultaneously and in real time, into multiple quality-maintained low-bit-rate videos to match the various bandwidths of client devices. On the other hand, it is crucial to represent foreground objects of arbitrary shape in the coded video stream. For example, in video conferencing applications, the conferees’ figures captured by each camera may be directly extracted and then blended into a virtual conference room. In this talk, I will present our recent work on object representation, coding, and analysis for conferencing and surveillance video applications, including: 1) methods to represent and encode the object shape in the HEVC coding loop with a small bitrate cost; 2) low-complexity background modeling and saliency-based object segmentation to segment visual objects from images and videos; 3) model-based high-efficiency scalable coding for conference and surveillance videos; 4) methods for, and some results from, integrating video coding and video analysis in a single framework.