All slides and pictures from the symposium were uploaded at 12 a.m. on August 20, 2013.


It is a pleasure to welcome you to the International Symposium on Social Multimedia and Cyber-Physical-Social Computing, held in Beijing, China, during August 15-16, 2013. Please click here for the workshop program. Information on all talks and lecturers can be found on this page.

Cyber-Physical-Social (CPS) computing is an emerging cross-disciplinary research area attracting attention worldwide. It combines computational, physical, and social elements that interact with, reflect, and influence one another. CPS computing is starting to affect every aspect of human life and is likely to become increasingly pervasive. Meanwhile, social multimedia is evolving with the popularity of online social networking to shape the cyber world, and serves as a key means of describing the physical world.

With the advent of ubiquitous sensing and networking and increasing personal computing capacity, future online social networks will be more deeply involved in cyber-physical interactions, which suggests a growing need for research on applying social multimedia to CPS computing. We believe that emerging techniques from social multimedia may play a key role in the development of CPS computing.

The research should proceed along two parallel lines: fundamentals and applications. On one hand, the convergence of the social, multimedia, cyber, and physical will exhibit a variety of novel characteristics that call for the development and exploration of new concepts, opening many issues and challenges for the research communities. On the other hand, the combination of social multimedia and CPS computing will significantly change the way we see the world, benefit from it, and change it. Cross-disciplinary applications, e.g., personal healthcare and public security, are among the first to be explored. As in any mature discipline, fundamentals and applications must build on each other: fundamentals without applications soon become useless, and applications without fundamentals are short-lived. At the workshop we will hold brainstorming sessions to identify important directions for fundamental research while addressing solid applications that will build the basis for CPS.




Talk 1:       9:45am - 10:15am, Aug. 15, 2013

Speaker:   Prof. Ramesh Jain, Information & Computer Sciences, University of California, Irvine

Title:       Challenges in Building Social Life Networks


Abstract:     The availability of enormous volumes of heterogeneous geo-spatial data streams offers a great opportunity to address important societal problems, though doing so requires meeting several interesting challenges. Heterogeneous data streams generated by social media, sensor networks, the Internet of Things, and the digitization of transactions may allow the design and implementation of networks that connect people with other people and with essential life resources. We call these networks Social Life Networks (SLNs). This is the right time to focus efforts on discovering and developing the technology and infrastructure needed to design and build these networks and to apply them to essential societal problems, helping people in their everyday lives as well as during abnormal situations. A person needs to be connected to appropriate resources in a given situation, and situations in an SLN should be detected from heterogeneous data streams. We are building a software framework for situation recognition and for determining personal context in order to connect people to resources efficiently, effectively, and in a timely manner. This is only the very first step in connecting people to needed resources using SLNs. We will discuss our ideas and present our research on situation recognition, EventShop, and making connections.
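As a toy illustration of situation recognition over geo-spatial streams (a deliberately simplified sketch, not the actual EventShop framework; the grid size, threshold, and observation format are invented for this example):

```python
import numpy as np

def situation_grid(observations, grid_size=(4, 4), threshold=3):
    """Aggregate geo-tagged observations (lat, lon normalized to [0, 1))
    into a spatial grid and flag cells whose count exceeds a threshold."""
    counts = np.zeros(grid_size, dtype=int)
    for lat, lon in observations:
        i = int(lat * grid_size[0])
        j = int(lon * grid_size[1])
        counts[i, j] += 1
    # A "situation" is flagged wherever observed activity is unusually dense.
    return counts, list(zip(*np.where(counts >= threshold)))

# Toy stream: a burst of reports clustered in one region, one outlier.
obs = [(0.10, 0.10), (0.12, 0.11), (0.13, 0.09), (0.11, 0.10), (0.80, 0.90)]
counts, alerts = situation_grid(obs)
```

Real situation recognition would of course combine many stream types and temporal windows; the grid-aggregate-threshold pattern above is only the core spatial step.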

Biography:     Prof. Ramesh Jain is an entrepreneur, researcher, and educator. He co-founded several companies, managed them in initial stages, and then turned them over to professional management. These companies include PRAJA, Virage, and ImageWare. Currently he is involved in Stikco Studio. He has also been advisor to several other companies including some of the largest companies in media and search space.
He is a Donald Bren Professor in Information & Computer Sciences at the University of California, Irvine, where he conducts research on the Event Web and experiential computing. Earlier he served on the faculty of Georgia Tech, the University of California, San Diego, the University of Michigan, Ann Arbor, Wayne State University, and the Indian Institute of Technology, Kharagpur. He is a Fellow of the ACM, IEEE, AAAI, IAPR, and SPIE. His current research interests are in searching multimedia data and creating EventWebs for experiential computing. He is the recipient of several awards, including the 2010 ACM SIGMM Technical Achievement Award.

Talk 2:       10:15am - 10:40am, Aug. 15, 2013

Speaker:   Dr. Koji Zettsu, National Institute of Information and Communications Technology (NICT)

Title:       Cross-Domain Access to Cyber-Physical-Social Event Data


Abstract:     With the advent of Big Data, vast and heterogeneous information reflecting natural environments and social activities can be found everywhere on the Internet, including sensing data, the Web, and social media. While most current applications focus individually on improving performance with high-volume data, discovering correlations among heterogeneous event data across multiple domains or applications is crucial for capturing unforeseen, complex situations such as natural disaster response. To this end, the NICT Information Services Platform aims to extend conventional event information management to support cross-domain access to event data. Based on a unified event data model, it provides an event warehouse for the collection and integration of multi-domain, heterogeneous sensor data from the physical, cyber, and social spaces. It also facilitates searching the event warehouse for correlated datasets based on spatial, temporal, ontological, and citational correlation analysis. This talk introduces these technologies together with a future vision.
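The temporal side of such cross-domain correlation analysis can be sketched in a few lines. The daily counts below are invented for illustration, and a real event warehouse would use far richer spatial, ontological, and citational correlation measures alongside this:

```python
import numpy as np

# Hypothetical daily event counts for two domains over one week,
# e.g. rainfall-sensor alerts vs. disaster-related social media posts.
rain_alerts  = np.array([0, 1, 5, 9, 4, 1, 0], dtype=float)
social_posts = np.array([1, 2, 7, 12, 6, 2, 1], dtype=float)

# Temporal correlation between the two streams (Pearson's r);
# a high value suggests the cross-domain event data are related.
r = np.corrcoef(rain_alerts, social_posts)[0, 1]
```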

Biography:     Dr. Koji Zettsu received his B.E. degree from the Tokyo Institute of Technology in 1992 and his Ph.D. in Informatics from Kyoto University in 2005. He is Director of the Information Services Platform Laboratory at the Universal Communication Research Institute of the National Institute of Information and Communications Technology (NICT), Japan. He was a visiting associate professor at Kyoto University from 2008 to 2012 and a visiting researcher at Christian-Albrechts-University Kiel, Germany, in 2009. He was the technical editor of the Value-creating Network sub-working group of the New Generation Network Forum, Japan, from 2009 to 2010, and was with the IBM Yamato Software Laboratory from 1992 to 2003. His research interests are information retrieval, databases, data mining, and software engineering. He is a member of IPSJ, DBSJ, and ACM.

Talk 3:       10:40am - 11:05am, Aug. 15, 2013

Speaker:   Prof. Wenwu Zhu, Tsinghua University

Title:       Social-Sensed Multimedia Computing


Abstract:     Most multimedia applications try to deliver multimedia content to end users according to their information needs; multimedia computing thus acts as a bridge between multimedia data and user needs. In past years, however, the multimedia research community has mostly focused on multimedia content analysis and understanding, while users' needs over multimedia data have rarely been studied, resulting in the intention-gap problem. This talk will first present the social-sensed multimedia computing concept and framework, which injects social factors into traditional multimedia computing to ultimately bridge the intention gap. We will then show several case studies of social-sensed multimedia computing, including social-sensed image search, recommendation, and video replication.

Biography:     Wenwu Zhu is with the Computer Science Department of Tsinghua University as a national professor of China's "1000 People Plan". Prior to his current post, he was a Senior Researcher and Research Manager at Microsoft Research Asia. He was the Chief Scientist and Director at Intel Research China from 2004 to 2008, and worked at Bell Labs in New Jersey as a Member of Technical Staff from 1996 to 1999.
Wenwu Zhu is an IEEE Fellow and an ACM Distinguished Scientist. He has published over 200 refereed papers and is an inventor or co-inventor of over 40 patents. His current research interests are in multimedia communications and networking, including multimedia cloud computing, social media computing, and wireless multimedia communications. He serves or has served on various editorial boards: as Guest Editor for the Proceedings of the IEEE, IEEE T-CSVT, IEEE JSAC, and IEEE Wireless Communications, and as Associate Editor for the IEEE Transactions on Mobile Computing, IEEE Transactions on Multimedia, and IEEE Transactions on Circuits and Systems for Video Technology. He received the Best Paper Award at ACM Multimedia 2012, the Best Paper Award of the IEEE Transactions on Circuits and Systems for Video Technology in 2001, and three other Best Paper Awards. He is the past Chair of the Visual Signal Processing and Communication Technical Committee of the IEEE Circuits and Systems Society (2004-2008) and served on the Steering Committee of the IEEE Transactions on Mobile Computing (2007-2010). He served as TPC Co-Chair of IEEE ISCAS 2013 and will serve as TPC Co-Chair of ACM Multimedia 2014.
Wenwu Zhu received the M.S. degree from the Illinois Institute of Technology, Chicago, and the Ph.D. degree from the Polytechnic Institute of New York University in 1993 and 1996, respectively, both in Electrical and Computer Engineering.

Talk 4:       11:05am - 11:30am, Aug. 15, 2013

Speaker:   Prof. Fei Wu, Zhejiang University

Title:       Cross-Media Retrieval, Hashing, and Ranking


Abstract:     Many real-world applications today involve multimodal data. Cross-media retrieval is imperative to many applications of practical interest, such as finding the textual documents about a tourist spot that best match a given image of the spot, or finding a set of images that best illustrate a given text description. However, the heterogeneity gap between multi-modal data is widely understood as a fundamental barrier to successful cross-media retrieval. In this talk, I will introduce three recent works in cross-media retrieval: a) supervised coupled dictionary learning with group structures for multi-modal retrieval, which uses class information to jointly learn discriminative multi-modal dictionaries as well as mapping functions between modalities; b) sparse multi-modal hashing, which obtains sparse code sets for data objects across modalities via joint multi-modal dictionary learning, thereby expediting approximate nearest neighbor (ANN) search over cross-media data; and c) cross-media semantic representation via bi-directional learning to rank, which learns a cross-media representation model by optimizing a list-wise ranking problem while taking advantage of bi-directional ranking examples.
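As a minimal sketch of the shared-space idea underlying these cross-media retrieval methods (using a plain least-squares projection in place of the coupled dictionary learning described in the talk; all data, dimensions, and the `retrieve` helper are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy features: 3 image/text pairs living in different-dimensional spaces.
img_feats = rng.standard_normal((3, 8))                  # visual descriptors
txt_feats = img_feats @ rng.standard_normal((8, 5))      # correlated text features

# Stand-in for a learned cross-modal mapping: a least-squares projection
# from image space into the 5-d text ("shared") space.
W_img, _, _, _ = np.linalg.lstsq(img_feats, txt_feats, rcond=None)

def retrieve(query_img, candidates):
    """Return the index of the text whose shared-space embedding is
    closest (by cosine similarity) to the projected image query."""
    q = query_img @ W_img
    sims = candidates @ q / (np.linalg.norm(candidates, axis=1) * np.linalg.norm(q))
    return int(np.argmax(sims))

best = retrieve(img_feats[0], txt_feats)
```

The dictionary-learning and hashing methods in the talk replace this linear map with learned sparse codes, but the retrieval step - compare in a common space - is the same.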

Biography:     Fei Wu received his B.Sc., M.Sc., and Ph.D. degrees in computer science from Lanzhou University, the University of Macau, and Zhejiang University in 1996, 1999, and 2002, respectively. From October 2009 to August 2010 he was a visiting scholar in Prof. Bin Yu's group at the University of California, Berkeley. He is currently a full professor in the College of Computer Science, Zhejiang University, where he is vice-director of the Institute of Artificial Intelligence and of the Key Laboratory of Visual Perception (Zhejiang University), Ministry of Education and Microsoft. He has served as a PC member of ACM Multimedia 2012 and 2013, and was awarded a place in the Program for New Century Excellent Talents in University (NCET) by the Ministry of Education in 2012. His research interests include multimedia retrieval, sparse representation, and machine learning.

Talk 5:       11:30am - 11:55am, Aug. 15, 2013

Speaker:   Prof. Rongrong Ji, Xiamen University

Title:       Mobile-End Social Multimedia Computing and Analytics


Abstract:     In this talk, I will present my recent work in mobile and social multimedia computing and analytics. First, I will present a recent advance in compact descriptors for visual search, together with the related MPEG standardization process. A low-bit-rate mobile location search system will be shown, with the ability to extract compact features efficiently on the mobile end. The key design is a low-rank feature compression scheme that extracts salient landmark regions from tourist photos and discards the backgrounds. Second, I will present a client-side multi-user augmented reality demo that can simultaneously reconstruct and label 3D point clouds using only a few mobile devices. This technique can be integrated into cutting-edge mobile augmented reality systems such as Google Glass. Third, I will introduce our recent work on social multimedia sentiment analytics. In this direction, I will introduce a large-scale visual sentiment ontology design, as well as a mid-level visual attribute descriptor called SentiBank, and show its effectiveness in predicting sentiment from images and text for Twitter hashtags.
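The low-rank compression idea can be illustrated with a plain truncated SVD (a sketch only: the scheme in the talk additionally exploits saliency to discard background regions, and the matrix sizes, rank, and noise level here are invented):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy descriptor matrix: 50 local features of dimension 32 that
# actually lie near a rank-4 subspace, plus a little noise.
basis = rng.standard_normal((4, 32))
feats = rng.standard_normal((50, 4)) @ basis + 0.01 * rng.standard_normal((50, 32))

# Low-rank compression: keep only the top-k singular components.
U, s, Vt = np.linalg.svd(feats, full_matrices=False)
k = 4
compressed = U[:, :k] * s[:k]   # 50 x 4 codes transmitted per query
decoder    = Vt[:k]             # 4 x 32 basis, shared once
recon = compressed @ decoder

# Relative reconstruction error stays small despite ~8x compression.
err = np.linalg.norm(feats - recon) / np.linalg.norm(feats)
```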

Biography:     Dr. Rongrong Ji has been a Professor at Xiamen University since 2013, where he directs the Intelligent Multimedia Technology Laboratory and serves as Dean Assistant in the School of Information Science and Technology. Before that, he was a postdoctoral research fellow in the Department of Electrical Engineering at Columbia University from 2010 to 2013, working with Professor Shih-Fu Chang. He obtained his Ph.D. degree in computer science from the Harbin Institute of Technology. He was a visiting student at the University of Texas at San Antonio working with Professor Qi Tian, a research assistant at Peking University working with Professor Wen Gao in 2010, and a research intern at Microsoft Research Asia working with Dr. Xing Xie from 2007 to 2008.
He is the author of over 40 papers in tier-1 journals and conferences, including IJCV, TIP, TMM, CVPR, IJCAI, AAAI, and ACM Multimedia. His research interests include image and video search, content understanding, mobile visual search, and social multimedia analytics. Dr. Ji is the recipient of the Best Paper Award at ACM Multimedia 2011, a Microsoft Fellowship in 2007, and the Best Thesis Award of the Harbin Institute of Technology. He is a guest editor for IEEE MultiMedia Magazine, Neurocomputing, and the ACM Multimedia Systems Journal. He has been a special session chair of MMM 2014, VCIP 2013, MMM 2013, and PCM 2012. He serves as a reviewer for IEEE TPAMI, IJCV, TIP, TMM, CSVT, TSMC-A/B/C, and IEEE Signal Processing Magazine, among others, and is on the program committees of over 10 top conferences, including CVPR 2013, ICCV 2013, ECCV 2012, and ACM Multimedia 2010-2013.

Talk 6:       1:30pm - 1:55pm, Aug. 15, 2013

Speaker:   Prof. Shuqiang Jiang, ICT, CAS

Title:       Computing Visual Similarity with Social Context


Abstract:     With the popularity of digital cameras and social networking, photographs and videos can be easily produced by ordinary users and shared online. However, faced with explosively growing collections of web images and videos, we cannot effectively retrieve and use them without effective data-mining tools, in which visual similarity plays a very important role. Traditionally, image similarity is measured by the mathematical distance between visual descriptors, but visual descriptors cannot fully represent the original image: there is a large gap between human perception and digital computation. Moreover, visual similarity is not a matter of consensus among users. Recently, social media for social interaction has developed into a very important source of information on the web. In this talk, I will discuss the problem of computing visual similarity with social context and introduce two techniques. The first uses social tags, learning distance metrics from multiple features and tags. The second computes visual similarity with hierarchical semantic relations, embedding semantic relations into a distance metric learning framework.
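A deliberately simplified sketch of blending visual distance with social-tag context (real distance metric learning, as in the talk, would learn the weighting from data; the fixed `alpha` blend and the toy features and tags here are assumptions for illustration):

```python
import numpy as np

def social_similarity(v1, v2, tags1, tags2, alpha=0.5):
    """Blend visual similarity with tag (social-context) similarity.
    alpha weights the visual term; a metric-learning approach would
    learn this trade-off instead of fixing it."""
    visual = 1.0 / (1.0 + np.linalg.norm(np.asarray(v1, float) - np.asarray(v2, float)))
    tags1, tags2 = set(tags1), set(tags2)
    jaccard = len(tags1 & tags2) / len(tags1 | tags2) if (tags1 | tags2) else 0.0
    return alpha * visual + (1 - alpha) * jaccard

# Two candidates equally close visually; one shares tags with the query.
sim_tagged   = social_similarity([1, 0], [1.1, 0], {"beach", "sunset"}, {"beach", "sea"})
sim_untagged = social_similarity([1, 0], [1.1, 0], {"beach", "sunset"}, {"office"})
```

The social context breaks the tie: the tag-sharing candidate scores higher even though the visual distances are identical.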

Biography:     Shuqiang Jiang is an associate professor with the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, and is also with the Key Laboratory of Intelligent Information Processing, Chinese Academy of Sciences. His research interests include multimedia processing and semantic understanding, pattern recognition, and computer vision. He has authored or coauthored more than 100 papers on these topics. Dr. Jiang was supported by the Beijing New-Star Program of Science and Technology in 2008. He won the Lu Jiaxi Young Talent Award from the Chinese Academy of Sciences in 2012 and the CCF Award of Science and Technology in 2012. He is a senior member of the IEEE and a member of ACM, CCF, and YOCSEF. He has served as guest editor of special issues for PR and MTA, program chair of ICIMCS 2010, special session chair of PCM 2008 and ICIMCS 2012, area chair of PCIVT 2011, publicity chair of PCM 2011, and proceedings chair of MMSP 2011. He has also served as a TPC member for more than 20 well-known conferences, including ACM Multimedia, CVPR, ICCV, ICME, ICIP, and PCM.


Talk 7:       1:55pm - 2:20pm, Aug. 15, 2013

Speaker:   Dr. Jitao Sang, CASIA

Title:       User-centric Social Multimedia Computing


Abstract:     Social multimedia has three fundamental elements: users, multimedia content, and interactions. Social multimedia computing aims to connect users and multimedia content by analyzing these interactions, addressing the three problems of multimedia content understanding, user modeling, and social network analysis. In this talk, I will introduce a user-centric social multimedia computing framework. First, the user is the basic data-collection unit: users have evolved from information receivers into information contributors, and massive user-generated content (UGC) provides training examples for multimedia content understanding. Second, the user is the ultimate target of information services: information service in social media is essentially user-oriented, so inferring user interests and understanding and satisfying their personalized needs are its major tasks. Based on this framework, we have conducted work along four research lines: (1) user-aware multimedia content understanding, (2) online activity-based user modeling, (3) topic-level user relation analysis, and (4) common-user-based cross-network collaboration.
Finally, I will conclude the talk with some prospects for social multimedia computing. With the development of the mobile Internet and wearable technology, users will serve to connect the cyber and physical worlds and will become the fundamental computing terminal.

Biography:     Dr. Jitao Sang has been an assistant professor at the Institute of Automation, Chinese Academy of Sciences (CASIA) since 2012. He obtained his Ph.D. degree in computer science from CASIA. He was a research intern at Microsoft Research Asia working with Dr. Tao Mei in 2011, and a research assistant at the China-Singapore Institute of Digital Media working with Prof. Changsheng Xu in 2010.
His research interests include multimedia retrieval and social media analysis. He has received several awards, including Best Paper Candidate at ACM Multimedia 2012 and 2013, Best Student Paper at MMM 2013, the Special Prize of the CAS President Scholarship, and the Best Thesis Award of CAS. He has organized special sessions and special issues for MMM 2013, ICIMCS 2013, and the Multimedia Systems Journal. More details can be found on his personal webpage.

Talk 8:       2:20pm - 2:45pm, Aug. 15, 2013

Speaker:   Dr. Tao Mei, Microsoft Research Asia

Title:       Deep Understanding of Users for Socio-Mobile Recommendation

Abstract:     People are contributing massive amounts of social media content for knowledge sharing, anytime and anywhere. Deeply understanding users from heterogeneous, complex, and dynamic social media therefore becomes important for socio-mobile applications. There are four elements in the loop of understanding a user: user profile, context, multi-modal input, and interactivity. This talk will introduce recent advances in understanding these key elements by leveraging machine intelligence and natural user interaction on mobile devices. Specifically, we will introduce: 1) how to accurately and comprehensively estimate a mobile user's geo-context from a phone-captured photo, 2) how to leverage the estimated context information for personalized mobile recommendation, and 3) how to analyze multi-modal user input for cross-media recommendation. We will also showcase recently developed applications of this kind.

Biography:     Tao Mei is a Researcher with Microsoft Research Asia, Beijing, China. He received the B.E. degree in automation and the Ph.D. degree in pattern recognition and intelligent systems from the University of Science and Technology of China, Hefei, China, in 2001 and 2006, respectively. His research interests include multimedia information retrieval and computer vision. He has authored or co-authored over 100 papers in journals and conferences in these areas. He holds five granted U.S. patents and has more than 30 pending. He is a Senior Member of the IEEE and the ACM.

Talk 9:       9:30am - 10:00am, Aug. 16, 2013

Speaker:   Prof. Thomas Plagemann, University of Oslo

Title:  From Multimodal Sensing to Complex Event Detection in Health Applications


Abstract:     In this talk we look at the future possibilities that modern sensors will provide for health applications and the challenges that need to be addressed. To simplify the development of such applications, we propose to use complex event processing on multimodal data streams. We take a closer look at two applications, heart attack prediction and Ambient Assisted Living, discuss the challenges and our approach, and present some results.
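A minimal sketch of complex event processing over multimodal health streams (the window size, thresholds, and the high-heart-rate/low-activity pattern are invented for illustration, not the talk's actual detection rules):

```python
from collections import deque

def detect_alerts(events, window=3, hr_high=110, act_low=0.2):
    """Tiny complex-event-processing sketch: flag timestamps where a
    high heart rate coincides with low physical activity within a
    sliding window (a hypothetical early-warning pattern)."""
    hr_win = deque(maxlen=window)
    act_win = deque(maxlen=window)
    alerts = []
    for t, heart_rate, activity in events:
        hr_win.append(heart_rate)
        act_win.append(activity)
        # Complex event: both simple conditions hold in the same window.
        if max(hr_win) >= hr_high and min(act_win) <= act_low:
            alerts.append(t)
    return alerts

# Toy fused stream of (timestamp, heart rate, accelerometer activity).
stream = [(0, 80, 0.5), (1, 95, 0.6), (2, 120, 0.1), (3, 118, 0.1)]
alerts = detect_alerts(stream)
```

A real CEP engine would express such patterns declaratively and handle out-of-order, multi-rate streams; the sliding-window conjunction above is only the core idea.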

Biography:     Thomas Plagemann has been a Professor at the University of Oslo since 1996, where he leads the Distributed Multimedia Systems research group in the Department of Informatics. He received his Dr.sc. degree from the Swiss Federal Institute of Technology (ETH) in 1994 and in 1995 received the ETH Zurich Medal for his excellent doctoral thesis. He has published over 150 papers in peer-reviewed journals, conferences, and workshops in his field. He serves as Associate Editor for ACM Transactions on Multimedia Computing, Communications and Applications and as Editor-in-Chief for the Springer Multimedia Systems Journal.

Group Allocation

Group 1:   Ramesh Jain, Koji Zettsu, Changsheng Xu, Tao Mei, Zhengjun Zha, Jing Liu, Peng Cui, Thomas Plagemann, Bingkun Bao

Group 2:   Siripen Pongpaichet, Shuqiang Jiang, Rongrong Ji, Yue Gao, Eric Huo, WeiZhe Ni, Minh-Son Dao, Jitao Sang

Intro. to attendees


All slides


All pictures