Mining Cross-network Association for YouTube Video Promotion
Ming Yan, Jitao Sang and Changsheng Xu*
Summary
(1) Large quantities of videos are consumed on YouTube, and the trend is growing year by year.
(2) YouTube's internal mechanisms propagate videos with limited efficiency, so many videos remain unknown to the wider public.
(3) External referrers such as social media websites have become important sources of traffic to YouTube videos; among them, Twitter has recently grown to be the top referrer.
(4) Its followee-follower, user-centric architecture gives Twitter notably high information propagation efficiency.
(5) Our motivation in this work: for a specific YouTube video, identify the proper Twitter followees so as to maximize the video's dissemination to their followers.
Framework
Our problem poses two main challenges:
(1) The heterogeneous knowledge association between YouTube videos and Twitter followees;
(2) How to define the “properness” of a candidate Twitter followee for a specific YouTube video.
To address these challenges, we propose a three-stage framework:
(1) Heterogeneous Topic Modeling: To discover the latent structure within the YouTube video space and the Twitter user space, respectively.
(2) Cross-network Topic Association: To address the discrepancy between the heterogeneous YouTube video and Twitter user spaces by mining cross-network topic association at a collective user level.
(3) Referrer Identification: To define the “properness” of a candidate Twitter followee for a specific YouTube video and match videos to followees with a ranking-based method.
Heterogeneous Topic Modeling
In this stage, the YouTube video space and the Twitter user space are each learnt within their own social network, using a generative topic model suited to each.
(1) YouTube Video Topic Modeling: The video topics are expected to span over both textual and visual spaces. We introduce a modification to the multi-modal topic model, Corr-LDA, as depicted in the figure.
(2) Twitter Followee Topic Modeling: Since the properness of a Twitter followee is decided by the followers, we represent each Twitter user (document) by all of his/her followees (words) and apply standard LDA to the user social graph for topic modeling.
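The user-as-document, followee-as-word representation can be illustrated with a small sketch. The code below is not the authors' implementation: it is a minimal collapsed Gibbs sampler for standard LDA, and the follow graph, the hyperparameters `alpha` and `beta`, and the function name `lda_gibbs` are all assumptions chosen for illustration.

```python
import random


def lda_gibbs(docs, n_topics, n_words, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for standard LDA.

    docs: list of lists of word ids (here: the followee ids of each user).
    Returns (theta, phi): per-user topic distributions and
    per-topic followee distributions.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []

    # Random initial topic assignment for every (user, followee) token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[k][w] + beta) / (topic_total[k] + n_words * beta)
         for w in range(n_words)]
        for k in range(n_topics)
    ]
    return theta, phi


# Hypothetical follow graph: each user is a "document" of followee "words".
follows = {
    "alice": ["@nba", "@espn", "@nfl"],
    "bob": ["@espn", "@nba"],
    "carol": ["@nasa", "@nature", "@sciam"],
    "dave": ["@nature", "@nasa"],
}
vocab = sorted({f for fs in follows.values() for f in fs})
index = {f: i for i, f in enumerate(vocab)}
docs = [[index[f] for f in fs] for fs in follows.values()]
theta, phi = lda_gibbs(docs, n_topics=2, n_words=len(vocab))
```

Here `theta` holds one topic distribution per user and `phi` one distribution over followees per topic, so each followee can be characterised by the topics in which it carries high weight.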
Cross-network Topic Association (for details, please refer to the paper)
In this stage, we propose a solution that first aggregates the YouTube video topic distributions to the user level, and then exploits the overlapped users (users with accounts on both networks) as a bridge for association mining.
Three alternative methods are devised for association mining. The first two assume that a linear cross-network association matrix A exists between the Twitter and YouTube user distributions, and they focus on deriving this matrix. The third does not make this assumption; instead, it assumes that a user's distinct distributions in the different social networks result from a latent but unique user attribute S.
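As a rough illustration of the aggregation step, the sketch below averages the topic distributions of the videos each user interacted with. Simple averaging is an assumption made here for illustration (the aggregation rule itself is given in the paper), and the names `aggregate_to_user_level`, `video_topics` and `user_videos` are hypothetical.

```python
def aggregate_to_user_level(video_topics, user_videos):
    """Average the topic distributions of each user's videos.

    video_topics: dict video id -> topic distribution (list of floats).
    user_videos:  dict user id -> list of video ids the user interacted with.
    Averaging is an assumed aggregation rule, used only for illustration.
    """
    user_topics = {}
    for user, videos in user_videos.items():
        rows = [video_topics[v] for v in videos if v in video_topics]
        if not rows:
            continue  # no known videos for this user: skip
        n_topics = len(rows[0])
        user_topics[user] = [
            sum(row[k] for row in rows) / len(rows) for k in range(n_topics)
        ]
    return user_topics


# Hypothetical toy data.
video_topics = {"v1": [0.8, 0.2], "v2": [0.4, 0.6]}
user_videos = {"alice": ["v1", "v2"], "bob": ["v2"], "carol": []}
user_topics = aggregate_to_user_level(video_topics, user_videos)
```

Because each video distribution sums to one, the averaged user-level distribution does as well, which keeps it comparable with the Twitter-side user distributions.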
(1) Transition Probability-based Association (TP)
(2) Regression-based Association
(3) Latent Attribute-based Association (LA)
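To make the idea of a linear association matrix concrete, the sketch below estimates a row-normalised matrix A from overlapped users and uses it to transfer a YouTube topic distribution into the Twitter topic space. The co-occurrence estimator is an assumption in the spirit of a transition probability, not the derivation used in the paper, and the function names and toy data are hypothetical.

```python
def transition_association(yt_users, tw_users):
    """Estimate A[i][j] ~ P(Twitter topic j | YouTube topic i).

    Uses only overlapped users (present in both dicts). The estimator
    (accumulated outer products, then row normalisation) is an assumption.
    """
    overlap = sorted(set(yt_users) & set(tw_users))
    if not overlap:
        raise ValueError("no overlapped users to bridge the two networks")
    n_yt = len(yt_users[overlap[0]])
    n_tw = len(tw_users[overlap[0]])
    A = [[0.0] * n_tw for _ in range(n_yt)]
    for u in overlap:
        for i, p in enumerate(yt_users[u]):
            for j, q in enumerate(tw_users[u]):
                A[i][j] += p * q
    for row in A:
        total = sum(row)
        if total > 0:
            row[:] = [x / total for x in row]
    return A


def transfer(yt_dist, A):
    """Map a YouTube topic distribution to the Twitter topic space: yt_dist . A"""
    n_tw = len(A[0])
    return [sum(yt_dist[i] * A[i][j] for i in range(len(A))) for j in range(n_tw)]


# Hypothetical user-level distributions; only u1 and u2 are overlapped.
yt_users = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "u3": [0.5, 0.5]}
tw_users = {"u1": [0.9, 0.1], "u2": [0.2, 0.8], "u4": [0.3, 0.7]}
A = transition_association(yt_users, tw_users)
video_tw = transfer([0.5, 0.5], A)
```

With rows of A summing to one, a transferred distribution stays a valid probability distribution. A regression-based variant would instead fit A by minimising the error between transferred and observed Twitter distributions over the overlapped users.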
Referrer Identification
Given a test video with its YouTube topic distribution, we can obtain its transferred Twitter topic distribution from stage 2. In this third stage, we match the YouTube video with Twitter followees in the same Twitter topic space using a ranking SVM-based scheme. Two critical issues must be addressed when using the ranking SVM training scheme:
(1) How to extract the video-followee pair features? We define the video-followee pair features as the vector product between the video and followee distributions.
(2) How to define the Ground-Truth (GT) properness for each video-followee pair in the training set?
where U_v is the set of users showing interest in video v, and U^followee_u is the follower set of followee u.
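The matching step can be sketched as follows. Several things here are assumptions: the "vector product" is read as an element-wise product, the weight vector is a hypothetical stand-in for a trained ranking SVM model, and the properness values are made-up inputs rather than values computed from the definition in the paper. The pairwise-difference construction is the standard reduction used to train a ranking SVM.

```python
def pair_features(video_tw, followee_tw):
    """Element-wise product of two distributions in the Twitter topic space
    (an assumed reading of the 'vector product')."""
    return [v * f for v, f in zip(video_tw, followee_tw)]


def rank_followees(video_tw, followees, weights):
    """Rank followees for one video by a linear score w . features.

    weights stands in for a learned ranking SVM weight vector (hypothetical).
    """
    scored = []
    for name, dist in followees.items():
        feats = pair_features(video_tw, dist)
        score = sum(w * x for w, x in zip(weights, feats))
        scored.append((score, name))
    scored.sort(key=lambda t: (-t[0], t[1]))  # best score first, name breaks ties
    return [name for _, name in scored]


def pairwise_training_examples(features, properness):
    """Standard ranking SVM reduction for one video: every followee pair with
    different ground-truth properness yields a feature-difference vector
    labelled +1 (and its negation labelled -1)."""
    examples = []
    names = sorted(features)
    for a in names:
        for b in names:
            if properness[a] > properness[b]:
                diff = [x - y for x, y in zip(features[a], features[b])]
                examples.append((diff, 1))
                examples.append(([-d for d in diff], -1))
    return examples


# Hypothetical toy data in a two-topic Twitter space.
video_tw = [0.9, 0.1]
followees = {"@espn": [0.8, 0.2], "@nasa": [0.1, 0.9]}
ranking = rank_followees(video_tw, followees, weights=[1.0, 1.0])

features = {name: pair_features(video_tw, d) for name, d in followees.items()}
properness = {"@espn": 0.6, "@nasa": 0.1}  # made-up ground-truth values
train = pairwise_training_examples(features, properness)
```

A binary linear classifier trained on `train` yields a weight vector that can be passed to `rank_followees` in place of the placeholder weights.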
Experiments
Discovered Topic Illustration (in YouTube and Twitter spaces, respectively)
Quantitative Evaluation Results
Evaluation results against different baselines for stage 2 and stage 3 are shown in the figure below. (a) demonstrates the advantage of the latent attribute-based methods and of considering both overlapped and non-overlapped users. (b) validates our motivation that a more accurate distribution transfer function and a ranking-based supervised scheme contribute to better application performance.
Publication
Mining Cross-network Association for YouTube Video Promotion [pdf]
[slides]
Ming Yan, Jitao Sang and Changsheng Xu
In ACM Multimedia 2014 (MM), Orlando, Florida, USA.