+ All Categories
Home > Documents > An Empirical Analysis of User Content Generation and...

An Empirical Analysis of User Content Generation and...

Date post: 03-Oct-2020
Category:
Upload: others
View: 1 times
Download: 0 times
Share this document with a friend
21
MANAGEMENT SCIENCE Vol. 57, No. 9, September 2011, pp. 1671–1691 issn 0025-1909 eissn 1526-5501 11 5709 1671 http://dx.doi.org/10.1287/mnsc.1110.1350 © 2011 INFORMS An Empirical Analysis of User Content Generation and Usage Behavior on the Mobile Internet Anindya Ghose, Sang Pil Han Stern School of Business, New York University, New York, New York 10012 {[email protected], [email protected]} W e quantify how user mobile Internet usage relates to unique characteristics of the mobile Internet. In par- ticular, we focus on examining how the mobile-phone-based content generation behavior of users relates to content usage behavior. The key objective is to analyze whether there is a positive or negative interdepen- dence between the two activities. We use a unique panel data set that consists of individual-level mobile Internet usage data that encompass individual multimedia content generation and usage behavior. We combine this knowledge with data on user calling patterns, such as duration, frequency, and locations from where calls are placed, to construct their social network and to compute their geographical mobility. We build an individual- level simultaneous equation panel data model that controls for the different sources of endogeneity of the social network. We find that there is a negative and statistically significant temporal interdependence between con- tent generation and usage. This finding implies that an increase in content usage in the previous period has a negative impact on content generation in the current period and vice versa. The marginal effect of this inter- dependence is stronger on content usage (up to 8.7%) than on content generation (up to 4.3%). The extent of geographical mobility of users has a positive effect on their mobile Internet activities. Users more frequently engage in content usage compared to content generation when they are traveling. In addition, the variance of user mobility has a stronger impact on their mobile Internet activities than does the mean. We also find that the social network has a strong positive effect on user behavior in the mobile Internet. These analyses unpack the mechanisms that stimulate user behavior on the mobile Internet. Implications for shaping user mobile Internet usage behavior are discussed. Key words : mobile Internet; social networks; content generation; content usage; interdependence; geographical mobility; identification History : Received September 10, 2009; accepted February 9, 2011, by Pradeep Chintagunta and Preyas Desai, special issue editors. Published online in Articles in Advance June 20, 2011. 1. Introduction Rapid advances in mobile Internet technologies now allow consumers to interact, create, and share content based on physical location. Such ubiquitous access to the mobile Internet also provides companies with new marketing opportunities. Newer marketing strategies need to be implemented in such an environment. This change requires a deeper understanding of user behavior on the mobile Internet. However, little is known as yet about how user mobile Internet usage relates to certain unique characteristics of the mobile Internet space. We examine that topic in this paper. There are a few aspects of mobile Internet that dis- tinguish it from other types of Internet access via devices like personal computers (PCs) or laptops. First, in many countries, users incur explicit expenses (for example, by paying usage-based data transmis- sion charges) during their mobile Internet usage. These are based on the number of bytes uploaded or downloaded. 1 This feature is in contrast to using PCs where the Internet can be accessed using a fixed connection or a WiFi connection without incurring any monetary costs based on usage. Second, users can access the Internet via mobile devices anytime and anywhere, subject to signal reception. In con- trast, PCs render stricter limitations on geographical mobility and access, typically constraining it to office or home or locations where there is access in place. Third, screen sizes are smaller on mobile devices com- pared to PCs, thereby rendering higher search costs for mobile devices. Usage-based data pricing and ubiquitous access can lead to a situation where a diverse set of factors, such as resource constraints (time or money), geo- graphical mobility, and social networks will influence 1 The fee structure in our empirical context is usage-based pricing. Subscribers are charged on a per-byte basis for data traffic they generate through content uploading and/or downloading. 1671
Transcript
Page 1: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

MANAGEMENT SCIENCEVol. 57, No. 9, September 2011, pp. 1671–1691issn 0025-1909 �eissn 1526-5501 �11 �5709 �1671 http://dx.doi.org/10.1287/mnsc.1110.1350

© 2011 INFORMS

An Empirical Analysis of User Content Generationand Usage Behavior on the Mobile Internet

Anindya Ghose, Sang Pil HanStern School of Business, New York University, New York, New York 10012

{[email protected], [email protected]}

We quantify how user mobile Internet usage relates to unique characteristics of the mobile Internet. In par-ticular, we focus on examining how the mobile-phone-based content generation behavior of users relates

to content usage behavior. The key objective is to analyze whether there is a positive or negative interdepen-dence between the two activities. We use a unique panel data set that consists of individual-level mobile Internetusage data that encompass individual multimedia content generation and usage behavior. We combine thisknowledge with data on user calling patterns, such as duration, frequency, and locations from where calls areplaced, to construct their social network and to compute their geographical mobility. We build an individual-level simultaneous equation panel data model that controls for the different sources of endogeneity of the socialnetwork. We find that there is a negative and statistically significant temporal interdependence between con-tent generation and usage. This finding implies that an increase in content usage in the previous period has anegative impact on content generation in the current period and vice versa. The marginal effect of this inter-dependence is stronger on content usage (up to 8.7%) than on content generation (up to 4.3%). The extent ofgeographical mobility of users has a positive effect on their mobile Internet activities. Users more frequentlyengage in content usage compared to content generation when they are traveling. In addition, the variance ofuser mobility has a stronger impact on their mobile Internet activities than does the mean. We also find that thesocial network has a strong positive effect on user behavior in the mobile Internet. These analyses unpack themechanisms that stimulate user behavior on the mobile Internet. Implications for shaping user mobile Internetusage behavior are discussed.

Key words : mobile Internet; social networks; content generation; content usage; interdependence; geographicalmobility; identification

History : Received September 10, 2009; accepted February 9, 2011, by Pradeep Chintagunta and Preyas Desai,special issue editors. Published online in Articles in Advance June 20, 2011.

1. IntroductionRapid advances in mobile Internet technologies nowallow consumers to interact, create, and share contentbased on physical location. Such ubiquitous access tothe mobile Internet also provides companies with newmarketing opportunities. Newer marketing strategiesneed to be implemented in such an environment.This change requires a deeper understanding of userbehavior on the mobile Internet. However, little isknown as yet about how user mobile Internet usagerelates to certain unique characteristics of the mobileInternet space. We examine that topic in this paper.

There are a few aspects of mobile Internet that dis-tinguish it from other types of Internet access viadevices like personal computers (PCs) or laptops.First, in many countries, users incur explicit expenses(for example, by paying usage-based data transmis-sion charges) during their mobile Internet usage.These are based on the number of bytes uploaded

or downloaded.1 This feature is in contrast to usingPCs where the Internet can be accessed using a fixedconnection or a WiFi connection without incurringany monetary costs based on usage. Second, userscan access the Internet via mobile devices anytimeand anywhere, subject to signal reception. In con-trast, PCs render stricter limitations on geographicalmobility and access, typically constraining it to officeor home or locations where there is access in place.Third, screen sizes are smaller on mobile devices com-pared to PCs, thereby rendering higher search costsfor mobile devices.

Usage-based data pricing and ubiquitous access canlead to a situation where a diverse set of factors,such as resource constraints (time or money), geo-graphical mobility, and social networks will influence

1 The fee structure in our empirical context is usage-based pricing.Subscribers are charged on a per-byte basis for data traffic theygenerate through content uploading and/or downloading.

1671

Page 2: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile Internet1672 Management Science 57(9), pp. 1671–1691, © 2011 INFORMS

user mobile Internet behavior. Advanced mobile tech-nologies, such as third-generation (3G) services, allowpeople to build and maintain their social relationshipsthrough cocreation and joint usage of content. Hence,people must decide how much content to generateand how much content to use. For example, in a givenweek, the higher the amount of time or money spentby a user to use content (e.g., downloading music,games, productivity apps, etc.), the lower the amountof time or money left for that user to generate con-tent (e.g., uploading photos, reviews, videos, etc.).2

In contrast to this scenario, higher levels of contentconsumption by users can motivate higher levels ofcontent contribution. This can occur when users beginto feel they are an integral part of the communitiesestablished on these sites and hence engage in recip-rocal behavior.

Either way, content generation and usage maynot be independent decision-making processes atall. Whether the former negative interdependence isstronger than the latter positive one is an empiri-cal question we aim to examine here. Keeping theseissues in mind then, in this paper we focus on exam-ining how user content generation behavior relatesto user content usage behavior over time. Is there apositive or negative interdependence between thesetwo activities? What other factors, if any, affect usercontent generation and usage activities on the mobileInternet?

We examine these questions by applying a uniquedata set that consists of mobile data across a panelof users, encompassing both their content gener-ation and usage behavior. That data set consistsof 2.34 million individual-level mobile data recordsacross 180,000 users. We also use data on voice callsmade by the same users, which allows us to drawtheir social networks. We include detailed user demo-graphics (age and gender) and geographical data,including the location from where a call is placed.This location information helps us impute the extentof user geographical mobility by mapping the dif-ferent places they visit. We then construct two dif-ferent measures of mobility to map both the meanand the variance of user travel patterns at both localand national levels, thereby giving us four differentmobility metrics. Our analysis utilizes simultaneousequation models as well as generalized method ofmoments (GMM)-based dynamic panel models andseveral other models (both linear and nonlinear) forrobustness checks.

There are three key sets of results. First, we findthat there exists a negative temporal interdependence

2 We use the terms “content generation” and “content uploading”interchangeably in this paper. Similarly, we use the terms “contentusage” and “content downloading” interchangeably as well.

between content generation and usage behavior onthe mobile Internet. This finding implies that theresource constraint (e.g., time and money constraint)is binding at least for some users. The effect is asym-metric, such that the negative impact of previousperiod content generation on current period contentusage is much higher than the opposite. Second, theextent of geographical mobility of users positivelyaffects their mobile Internet activities. Users more fre-quently engage in content downloading compared tocontent uploading when they are traveling. In addi-tion, variance of user travel patterns has a strongerimpact on mobile Internet activities than does themean. Third, mobile Internet usage behavior of socialnetwork neighbors positively influences an individualuser’s mobile Internet usage behavior.

Our paper aims to makes a few key contributionsto the literature. We are the first to simultaneouslymodel and estimate the drivers of user content gen-eration and usage behavior in the mobile Internetspace and the nature of the interdependence betweenthese processes. We build and estimate an individual-level simultaneous equation panel data model usingthree-stage least-squares (3SLS) estimation. We fur-ther demonstrate the robustness of this analysis byconducting GMM-based dynamic panel data analy-ses and other analyses that include models with arandom coefficient for a constant term, count datamodels, content share models, content size models,and models using different specifications of social net-work variables. Our paper is among the first in theemerging literature to research the dual role (con-tent creation and consumption) played by users insocial media settings. Second, the unique nature ofour data allows us to map how the mean and the vari-ance of user geographical mobility at both local andnational levels actually drive their content generationand usage behavior. We are the first to provide novelinsights into how location plays a role in user behav-ior on the mobile Internet. This paper can thus serveas the foundation for future research on both the eco-nomic and the social impact of the mobile Internet.Third, our model unpacks the causal mechanisms thatstimulate user behavior on the mobile Internet in thepresence of social networks. Mobile-phone-based datagenerate unique kinds of social networks because ofthe inherent dynamics present in communication andtravel patterns of users; these patterns make thesesocial networks distinct from various other kinds ofsocial networks already studied in the existing lit-erature. Our model distinguishes and controls forthe different sources of endogeneity in this mobilesetting, such as endogenous group formation, corre-lated unobservables, and simultaneity. Hence, anotherkey contribution of this paper is to provide a pre-cise empirical framework for resolving identificationissues in social networks now and in the future.

Page 3: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile InternetManagement Science 57(9), pp. 1671–1691, © 2011 INFORMS 1673

2. Literature ReviewOur paper is related to a small group of literature thatdiscusses the interplay between user Internet usages,mobility, and social networks.

First, we relate the dynamic interdependencebetween user content generation and usage behav-iors to two relevant streams of literature—economicbehavior under resource constraints and reciprocitystemming from social exchange theory. Researchershave long recognized that time acts as a constraint(Becker 1965, Jacoby et al. 1976). In online settings,users need to allocate their resources between con-tent generation and content usage activities becauseusers can take on the dual role of creators as wellas consumers (Ghose and Han 2009, Trusov et al.2010, Albuquerque et al. 2010, Ghose et al. 2011). Onthe mobile Internet, not only do users need to investtime, but also incur explicit transmission charges togenerate and use content in certain countries. Thissuggests that there is a negative temporal interdepen-dence between content generation and content usageactivities over time. In contrast, the prior work inthe online content sharing literature (Xia et al. 2011)draws on reciprocity stemming from social exchangetheory (Homans 1958) and suggests that the more auser benefits from the contributions of other users,the more that user is willing to create and share con-tent (Xia et al. 2011). This behavior suggests that thereis a positive temporal interdependence between con-tent generation and content usage activities over time.Because the extent of reciprocal interactions in themobile Internet setting is largely unknown, the over-all extent and directional interdependence of the tem-poral effect between content generation and usageremains still an intriguing empirical question.

Second, we examine the impact of the extent ofthe geographical mobility of a user on the mobileInternet activity of that same user. There are two pos-sible scenarios. The first is that the more a user travels,the more travel-related discretionary time the user islikely to have. Shim et al. (2008) analyze mobile usagepatterns where people view television programs ontheir phone screens. They find that the highest usageoccurs between 6 a.m. and 9 a.m. in the morning andbetween 6 p.m. and 8 p.m. in the evening, which is con-sistent with the notion that most users view contentusing mobile phones while commuting from home towork and back. O’Hara et al. (2007) find that peo-ple use mobile video content to pass time, managesolitude, and disengage from others. In contrast, it ispossible that mobile Internet usage can occur at geo-graphically fixed places as well. Hence, the overallextent of the impact of a user’s geographical mobilityon that same user’s propensity to engage in mobileInternet activity remains still an empirical question ofinterest.

Third, users can be influenced by others with whomthey communicate. There is some evidence of thisinfluence in the business world, such as adoption ofnew services and products (Hill et al. 2006, Tucker2008, Aral et al. 2009, Nair et al. 2010, Nam et al. 2010,Iyengar et al. 2011, Oestricher-Singer and Sundarara-jan 2010), switching from an existing service provider(Dasgupta et al. 2008), and diffusion of user-generatedcontent in online space (Susarla et al. 2011). Therefore,a user’s mobile Internet activities can be influencedby the mobile Internet activity of their peers.

3. Data Description and Basic PatternsIn this section, we provide a short overview of themobile Internet service found in our data, describethe data that we obtained from a large telecommuni-cations service company in South Korea, and finallyprovide the basic patterns seen in those data thatmotivated our subsequent model development.

3.1. Data SourceOur sample consists of 2.34 million mobile datarecords from 180,000 3G mobile users who used theservices of a particular company between March 15,2008, and June 15, 2008. The South Korea 3G mobilemarket had over 10 million subscribers in June 2008.3G mobile services enable users to upload and down-load their content faster than conventional mobile ser-vices. These services are more commonly available onlarger-screen handsets (i.e., smart phones).

There are two broad categories of websites thatusers can access through their mobile phones asdemonstrated in our data. The first category is regu-lar social networking and community websites. Exam-ples of such websites in our data include Cyworldand Facebook. The second category of websitesincludes portal sites specifically created by mobilephone service carriers. Examples include Nate Portaland KTF Portal, the Asian equivalents of U.S. sites likeVodafone live and T-Mobile’s Web “n” Walk. Contenton these sites can be accessed via a mobile phone byusers who subscribe to the services of their mobileoperator. Like social networking sites, these mobileportals are community-oriented sites that allow usersto download and upload (to share with others) mul-timedia content like photos, music, videos, apps, etc.The transmission charges are the same in our data,irrespective of whether users upload or downloadcontent and whether users access social networkingcommunity sites or mobile portal sites. We measurethe level of user mobile Internet activities based onthe frequency of content generation and usage.3

3 In our robustness checks, we also use the amount of bytes trans-mitted via user content uploading and downloading as an alterna-tive measure for the level of user mobile Internet activity.

Page 4: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile Internet1674 Management Science 57(9), pp. 1671–1691, © 2011 INFORMS

Table 1 Summary Statistics

Variable Observations Mean Std. dev.

Weekly, user-specific content activity dataNumber of Mobile Internet Session Activation 213401000 400 41038Number of Uploading 6101809 0027 3054Number of Downloading 6101809 22057 86080

Weekly, user-specific call dataNumber of Calls Made 9001000 11088 16032Call Duration (hours) 9001000 2061 5068

Weekly, user-specific geographical dataMean Local Mobility 9001000 14004 13018Mean National Mobility 9001000 5091 2088Local Mobility Dispersion 9001000 2085 3083National Mobility Dispersion 9001000 0070 0080

User characteristicsAge 1801000 30013 5091Sex (1= male, 0 = female) 1801000 0053 0050Handset Age (months) 1801000 9063 3097

Note. We observe content generation and usage data only when a user starts mobile Internet sessions; thus thenumber of uploading and the number of downloading are lower than the number of sessions.

3.2. Variable DescriptionOur mobile Internet data include individual-levelinformation on user content generation and usageactivities over time. The temporal unit in our analy-sis is a “week.” Technically, a unique mobile Internetsession starts when a user pushes a button on a key-pad or clicks an icon on a touchpad, and it ends whenthe user deactivates that mobile Internet session. Onlywhen a user initiates a mobile Internet session can theuser either download content or upload content, ordo both. An activity involving more than zero bytesof data transmission is an event. One mobile Internetsession usually consists of multiple events. As shownin Table 1, a user content usage occurs far more fre-quently than user content generation.

We construct four metrics with respect to user geo-graphical mobility: (1) mean local mobility, (2) meannational mobility, (3) local mobility dispersion, and(4) national mobility dispersion. The first two metricscapture the mean of user travel patterns, and the lasttwo metrics capture the variance of their travel pat-terns. Users often engage in mobile content activitieswhen they are outdoors and when they are trav-eling. In such circumstances, the geographical loca-tions from where their call is placed will change overtime. We refer to the number of unique locations fromwhere calls are placed by a user as a measure oftheir mean mobility. In a sense, this variable capturesthe mean travel pattern of a user. There are two dif-ferent levels of granularity regarding the extent ofmobility of each user, namely, local and national. Bothvariables measure the number of distinct locationsfrom where a user makes calls at the zip code leveland the province/state level, respectively. Because thetotal number of unique zip codes is much higher than

the number of provinces or states gathered in ourdata (i.e., 30,116 versus 16), the number of distinctzip-code-level locations thus corresponds to a user’smean local mobility, whereas the number of distinctprovince-level locations corresponds to a user’s meannational mobility.

The concept of mobility dispersion measures theextent of geographical deviation from one’s com-monly visited places (i.e., home and office), and inthat sense, the term captures the variance of a user’stravel patterns. A user’s mobility dispersion refersto a fraction of uncommonly visited places com-pared to the total number of places visited duringa given week.4 The notion behind the use of thisvariable is that deviations from routine patterns oftravel might indicate a trip to a location that isdifferent from a user’s regular travel. Such uniquetravel occasions could lead to a higher propensity(compared to the mean) to upload and download toshare travel experiences (i.e., sharing photos taken attourist attractions). As before, we have both local andnational levels of granularity to apply for the mobilitydispersion metric.

We also gather data on voice calls made by thesame users, which enables us to draw their socialnetworks. Voice call records contain a caller’s tele-phone number and the receiver’s telephone number,call duration, and frequency. Our voice call data canhelp us identify an exogenously defined network ofsocial neighbors because we do not use mobile Inter-net activity data per se to construct the network.

4 We define commonly visited places as those places where a uservisits at least once every week. Hence, we define an uncommonlyvisited place as one a user does not visit every week.

Page 5: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile InternetManagement Science 57(9), pp. 1671–1691, © 2011 INFORMS 1675

Social network variables in our sample correspond tothe level of content activity (i.e., frequencies of con-tent uploads and downloads) of network neighborsfor each user in the sample.

We use social network data to capture a sourceof learning based on word of mouth as identifiedin prior work (for example, Ghose and Han 2009).Mobile-phone-based data generate unique kinds ofsocial networks because of the inherent dynamics ofcommunication and mobility patterns, which makethem distinct from other kinds of social networksstudied in the existing literature. There are four pos-sible types of user behavior that can accrue from theinterplay of mobility and calling patterns—(a) lowmobility and infrequent calls to people at travel des-tinations, (b) high mobility and infrequent calls topeople at travel destinations, (c) low mobility andfrequent calls to people at travel destinations, and(d) high mobility and frequent calls to people attravel destinations. The inclusion of social networkand mobility variables in the model helps us capturemore precisely all four types of user behavior.

Finally, we have data on demographics like age,gender, and product characteristics such as handsetage. The summary statistics of these key variablesused are provided in Table 1.

3.3. Basic Patterns in the DataWe now discuss stylized patterns found in the datathat motivate our subsequent econometric modeldevelopment. First, we describe the interdependencein content generation and content usage at both

Figure 1 Aggregate-Level Mobile Internet Activity Frequency Series Plot

80,000

100,000

120,000

1,500

2,000

2,500

Upl

oad

freq

uenc

yD

ownl

oad

freq

uenc

y

1 30 60 90Days

Notes. Vertical dotted lines represent Sundays. Mobile Internet activity during weekdays is generally higher than that during weekends except the nationalholidays (e.g., Day 50, Children’s Day; Day 57, Buddha’s birthday; Day 83, Memorial Day).

the aggregate level and the individual level, whichare linked to the key objective of this paper. Sec-ond, we describe the relationship between mobileInternet session initiation and content downloadingand uploading activities. We also discuss specific evi-dence supporting the need for incorporating variouseconometric issues that we address in our model.

3.3.1. Interdependence Between Content Gener-ation and Content Usage. To have a better sense forhow content generation and usage patterns are associ-ated with each other, we plot the total number of con-tent uploads and content downloads in our sampleover 13 weeks (i.e., 91 days). Figure 1 shows a strongweekly cycle for both content uploads and down-loads; that is, mobile Internet activity during week-days is generally higher than during weekends excepton national holidays. This finding is consistent withprior work that shows that consumers surf on theInternet more during weekdays and regular workinghours than they do during weekends and off-peakhours (Baye et al. 2009). The plotting of content activ-ities also demonstrates similar patterns between thedual time series, implying that content generation andusage are indeed associated at the aggregate level.

Furthermore, we plot individual-level mobile Inter-net use patterns of some of the users in our data.Plot 1 in Figure 2 shows the weekly frequency pat-terns of the content generation and usage of User 9116and then provides evidence that suggests the needto incorporate “simultaneity” into our model. Thisplot shows that uploading frequency highly correlates

Page 6: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile Internet1676 Management Science 57(9), pp. 1671–1691, © 2011 INFORMS

Figure 2 Individual-Level Mobile Internet Activity Frequency Series Plots

Interdependence between content generation and usage

0

20

40

60

80

100

Fre

quen

cy

Fre

quen

cy

0 5 10 15Week

Plot 1: User 9116

0

20

40

60

Fre

quen

cy

0

20

40

60

0 5 10 15

Week Week0 5 10 15

Plot 2a: User 1927 Plot 2b: User 3186

DownloadUpload

Download

Mobile Internet session initiation and content usage

with downloading frequency for this particular user.This kind of correlated behavior can be driven by sev-eral factors—observed user heterogeneity (i.e., youngusers like both content generation and usage), cor-related unobservables (i.e., users who like to uploadalso like to download, and vice versa), etc. Althougha single-equation panel data model can accommodateobserved user heterogeneity (e.g., by including suchvariables as age and gender in the random effects (RE)specification and by differencing out user-specific,time-invariant characteristics), it cannot address thecorrelated unobservables mentioned above. This fea-ture suggests the need for the use of a “simultaneousequation model” for content generation and usageequations. Furthermore, Plot 1 suggests evidence ofnegative interdependence between content generationand usage, especially during the first nine weeks.

3.3.2. Mobile Internet Session Initiation and Con-tent Activities. Plot 2a and Plot 2b in Figure 2 showthe weekly frequency patterns of content generationand usage of User 1927 and User 3186, respectively.Although these two users have the same demographiccharacteristics (i.e., age and gender), they are very dif-ferent in terms of their propensity to initiate mobileInternet sessions as well as in terms of the num-ber of their content uploads and downloads. Notethat User 1927 in Plot 2a, having a higher frequencyof mobile Internet session initiation (13 times), hasengaged more frequently in content usage when com-pared to User 3186 in Plot 2b, who has a lower propen-sity for initiating a mobile Internet session (2 times).In other words, even after controlling for observeddemographics like age and gender, some unobserveduser characteristics like inherent interest in initiatinga mobile Internet session do explain behavioral differ-ences, such as the content usage between users. Thisfeature motivates the need to incorporate a “selection

constraint” in our model to control for the nonran-domness in the user mobile Internet session initia-tion stage.

Descriptive statistics from our data suggest thatyoung, male users tend more frequently to engagein mobile content generation and usage than othergroups. Thus, there could be a disproportionatelyhigher number of young, male users in this samplefor mobile session-initiating users (i.e., the selectedsample) compared to the total sample. Indeed, resultsfrom a random effect dynamic probit model for theuser session initiation equation support this argument(see Appendix B). Furthermore, if we were to groupthose users who have initiated mobile Internet ses-sions by the amount of their content activities (bothuploads and downloads) and divide the entire groupinto heavy users versus light users, we would seea disproportionately higher number of young, maleusers in the heavy-user sample compared to the light-user sample.

Last, Plots 1, 2a, and 2b indicate that the frequen-cies of generation and usage in week 1 vary by user.For example, User 1927 (Plot 2a) has 56 instances ofdownloading, whereas User 9116 (Plot 1) has about18 instances of downloading. This implies that userscould be different in terms of prior experiences withrespect to content generation and usage at the begin-ning of the sample (i.e., week 1). We thus model this“initial condition” issue by specifying selection equa-tions for week 1 and for weeks 2–13, separately.

4. The Econometric ModelTo analyze the underlying process of user con-tent generation and content usage, we build andestimate an individual-level simultaneous equationspanel data model using 3SLS estimation. We furtherdemonstrate the robustness of this analysis by con-ducting GMM-based dynamic panel data analyses.We also present and discuss other robustness check

Page 7: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile InternetManagement Science 57(9), pp. 1671–1691, © 2011 INFORMS 1677

analyses that include models with a random coeffi-cient in a constant term, count data models, contentshare models, content size models, and models usingnetwork variables based on different specifications.

In the mobile Internet space where users generateand use content with their phones, users face a two-step decision-making process. In Step 1, they decidewhether to initiate a mobile Internet session by click-ing a button on the mobile phone. In Step 2, once theyhave initiated a mobile Internet session, they deter-mine how much to upload (if any) and how much todownload (if any). They can engage in both upload-ing and downloading activities multiple times duringa given mobile Internet session. Hence, there are threerelated user decisions—(a) mobile Internet session ini-tiation, (b) content generation, and (c) content usage.Conditional on doing (a), the user could decide to do(b), (c), or both.

There are five econometric issues to address here:(i) sample selection bias, (ii) social network endogene-ity, (iii) initial conditions problem, (iv) unobserveduser heterogeneity, and (v) simultaneity.

First, recall that we (researchers) can observe thecontent generation and usage frequency of users onlyif these users initiate their mobile Internet sessions.Sample selection can arise in this setting because thosepeople who more frequently initiate their mobile Inter-net sessions can also be more prone to content gener-ation (or usage) than those who less frequently initi-ate their mobile Internet sessions. If uncorrected, theestimates in the main equations become biased andinconsistent, leading to a misleading inference (Heck-man 1979). Furthermore, it could undermine the exter-nal validity of estimates in that the estimates are onlyrelevant for the selected sample (i.e., people who actu-ally initiated mobile Internet sessions), thereby limit-ing the generalizability of any results. In other words,the main source of the selection bias here is the userdecision to initiate a mobile Internet session based ontheir discretion and intrinsic preferences as opposedto being randomly chosen. We control for this sam-ple selection bias by including a selection correctionterm in our main equations (i.e., content generationand usage frequencies).

Second, the mobile Internet behavior of a user andhis social network can seem to be correlated, regard-less of whether it occurred because of causal influenceor not. To identify the social network effect, we con-trol for three other factors that can lead to spuriouscorrelated behaviors: endogenous group formation,correlated unobservables, and simultaneity. Endoge-nous group formation can arise in our setting if userscommunicate with other users with similar tastes forcontent generation and usage. Correlated unobserv-ables can arise in our setting if marketing activitiesand promotions exist that are targeted to a user andthat user’s network neighbors. Simultaneity can arise

in our setting if network neighbors affect the user andthe user also affects them simultaneously. We explainthese aspects in detail in §4.3.2.

Third, we account for the well-known initial con-ditions problem in our model as follows: (1) for eachuser, the first observation in our sample may not bethe true initial outcome of his mobile content gen-eration and usage process, and (2) the frequenciesof generation and usage in week 1 vary greatly byuser. Fourth, users are, in general, different in termsof their propensities and preferences toward contentgeneration and usage. Whereas some consumers tendto be users of content created by others, others con-tribute by creating and uploading content to webportals and social networking sites. We account forthis phenomenon by incorporating both observed andunobserved user heterogeneity in our model. Finally,as several plots of usage have shown before, therecould be simultaneity between content generation andusage, and we need to account for that as well. Weaddress each of five econometric issues above in thefollowing §§4.1 and 4.2. We discuss the identificationissues involved in the model in §4.3.

4.1. Selection Equations: Mobile InternetSession Initiation

To address the sample selection bias statistically, weexplicitly specify our econometric model by extend-ing Verbeek and Nijman’s (1996) two-step method.5 InStep 1, related to the user’s decision in Step 1, we runan RE dynamic probit model for the user’s binary deci-sion for whether to initiate a mobile Internet session ornot initiate one during a given week. Estimates fromStep 1 are used to obtain a Heckman’s (1979) selec-tion correction term. In Step 2, we insert the correc-tion term into content generation and content usageequations, respectively, and estimate the two equa-tions simultaneously, using a 3SLS method.

To account for social network effects, we use alagged social network variable. To control for ob-served user heterogeneity, we include time-invariantuser-specific variables like age and sex. In addition,to account for the initial condition problem, we spec-ify two separate equations for a user’s mobile Internetsession initiation decision: one for the first time period

5 A single-shot estimation model cannot address all econometricissues involved in our context. This has been documented in priorwork. For example, Verbeek and Nijman (1996) and Stewart (2006)point out that a single-shot estimation is not able to control forsimultaneity. Similarly, Biørn (2004) points out that a single-shotestimation is unable to handle the selection issue. Furthermore, asingle-shot estimation is also computationally very intensive, giventhe size of our group of data (2.3 million observations). Hence,we utilize the computationally less stringent two-step estimatorfor our model while explicitly incorporating all economic issuesinvolved in our context.

Page 8: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile Internet1678 Management Science 57(9), pp. 1671–1691, © 2011 INFORMS

only and the second for the remaining time periods.We also include a lagged dependent variable. Thisvariable also allows us to control for state dependence.

There are two forms of dynamics in consumer deci-sions that can be modeled in a reduced form modellike ours—(i) habit persistence due to switching costsand (ii) variety-seeking behavior that has been iden-tified in prior work (Osborne 2007). In our setting,although many users may have prior experience withPC-based Internet browsing, they may be relativelynew to mobile-phone-based Internet browsing. There-fore, some users may have switching costs. Further-more, some consumers may be variety seeking. Ourdata confirm that users with higher levels of mobil-ity tend to engage more in mobile Internet activities.Hence, we can expect this kind of switching behavioracross users in mobile Internet settings. Therefore, byincluding a lagged dependent variable in the selectionequation, we can examine whether there are switch-ing costs (whether the sign of the lagged dependentvariable is positive and statistically significant) andany variety-seeking behavior (whether the sign of thelagged dependent variable is negative and statisticallysignificant).

An orthogonality problem may arise in our model.Because our selection equation is based on an REmodel, the estimator will be inconsistent if the unob-served, user-specific, time-invariant factor is corre-lated with the regressors therein. Hence, we followthe Mundlak (1978) and Zabel (1992) approachesand add the mean values of time-varying regressors(in our case, the mean value of session initiation bysocial network) to the selection equation. Notationsand variable descriptions are provided in Table A.1(see Appendix A).

Specifically, we specify that user i decides whetherto initiate mobile Internet sessions using an indicatorfunction (i.e., 1 = yes and 0 = no). We specify a modelfor the initial period (t = 1) as follows:

Session∗

i11 = �0 +�1Agei +�2Age2i +�3Sexi

+�4Handset Agei +�5tz−i1 t +ui (1)

Sessioni11 = 14Session∗

i11 > 050

For the remaining periods (t ≥ 2), we specify amodel as follows:

Session∗

i1t

=�0 +�1Sessioni1t−1 +�2Social Network Sessioni1t−1

+�3Agei+�4Age2i +�5Sexi

+�6Social Network Sessioni+�i+�t (2)

+�7z−i1t+�i1t1

Sessioni1t =14Session∗

i1t>051

where �i is a user-specific random coefficient, �t is atime-period dummy, z−i1t is a mean mobile Internetsession initiation of all other users in user i’s billingzip code, and �i1t is an error term.

If the initial conditions correlate with the unob-served, user-specific, time-invariant factor, as wouldbe expected in most situations, our estimators will beinconsistent. Because the ui in Equation (1) does cor-relate with �i in Equation (2), but does not correlatewith �i1t for t=21310001T , we specify ui as follows:6

ui =��i+�i111 (3)

where � is an initial condition parameter. We thenestimate the RE dynamic probit model, using maxi-mum likelihood estimation methods based on Stewart(2006, 2007).7

4.2. The Main Equations: Content Generation andContent Usage Frequencies

We specify a fixed effect (FE) model for content gen-eration and usage equations.8 To account for sam-ple selection bias, we insert a selection correctionterm into the content generation and content usageequations. To incorporate temporal interdependence,we include a lagged content download frequencyvariable in a content upload equation and a laggedcontent upload frequency variable in a content down-load equation. We include each one of four mobilitymetrics in our main equations in separate estima-tions to demonstrate robustness to both the mean andthe variance of the mobility metric, at both local andnational levels. To account for the individual-levelsocial network effect, as before, we include laggedsocial network variables to alleviate the endogene-ity bias that can arise when using the social networkvariable.9 We include the number of voice calls as acontrol variable in the content generation and usageequations to control for user inherent propensity to

6 We check serial autocorrelation in the error term and find the esti-mate for serial autocorrelation is not statistically significant (p-valueis 0.627).7 Given that there are several user-specific, time-invariant variablesthat may affect a user’s mobile Internet session initiation, we usean RE dynamic probit model for the selection equation.8 Verbeek and Nijman (1996) demonstrate that a fixed effect estima-tor in the main equations in their two-step approach is more robustto selection biases than a random effects estimator. Moreover, theyalso show that the conditions for the consistency of a fixed effectestimator are weaker than those for a consistent random effectsestimator.9 We also specify an alternative model, allowing for contempo-raneous social network effects by using an instrumental variableapproach for each equation, similar to Iyengar et al. (2011). Theresult indicates that the contemporaneous social network effect ispositive and statistically significant, whereas other estimates quali-tatively remain the same. Hence, we find no evidence of misspeci-fication bias from using a lagged social network variable.

Page 9: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile InternetManagement Science 57(9), pp. 1671–1691, © 2011 INFORMS 1679

make calls. Furthermore, we include control vari-ables, including user-specific dummies, time-perioddummies, and time-period- and location-specific fixedeffects at the user level, to control for endogeneityfrom using a social network variable as a regressor.

We take the logarithm on variables to controlfor their right-skewed nature (i.e., there are someheavy uploaders and heavy downloaders as seen inTable 1). We implement a 3SLS estimation on thefirst-differenced equations of log-transformed con-tent generation and content usage frequencies. Thesimultaneous estimation method allows for efficiencygain compared to single-equation estimation methodsby taking into account the cross-equation error corre-lation.10 Specifically, content generation frequency andusage frequency equations are specified as follows fort=21310001T :11

log4Uploadi1t5

=�0 +�14Downloadi1t−15+�2 log4Mobilityi1t5

+�3 log4Social Network Uploadi1t−15

+�4 log4Voicei1t5+�5Selectioni1t+�6 log4g−i1t5

+�7 log4Social Network Uploadi5+�i+�t+vi1t1 (4)

log4Downloadi1t5

=�0 +�1 log4Uploadi1t−15+�2 log4Mobilityi1t5

+�3 log4Social Network Downloadi1t−15

+�4 log4Voicei1t5+�5Selectioni1t+�6 log4h−i1t5

+�7 log4Social Network Downloadi5+�i+�t+�i1t1 (5)

where Social Network Activityi1t−1 =∑

m∈nt−14i54wi1m1t−1 ·

Activitym1t−15; wi1m1t−1 is the normalized number ofcalls user i made to user m in week t−1; Activity iseither Upload or Download; and g−i1t and h−i1t are meanuploading and downloading frequencies of all otherusers in user i’s billing zip code, respectively. In addi-tion, �i and �i are user-specific dummies, �t and �t

10 When the disturbance covariance matrix is not known, general-ized least squares is inefficient compared to full information max-imum likelihood and three-stage least-squares estimation (Lahiriand Schmidt 1978).11 Note that we cannot include a lagged dependent variable as aregressor in Equations (4) and (5) because the lagged dependentvariable in each equation correlate with the disturbance term afterfirst-differencing transformation, leading to inconsistency in theestimates. In addition, a series of control variables helps us capturethe impact of any potential omitted variable (i.e., lagged depen-dent variable). These control variables include (i) time-period fixedeffects, (ii) time and location-specific mean content activity vari-ables at the user level, and (iii) mean content activity by social net-work neighbors at the user level, which can mitigate this bias. Thatsaid, our main results are robust even when we include a laggeddependent variable and estimate that variable using GMM-baseddynamic panel data models (see Appendix C).

are time-period dummies, and vi1t and �i1t are user-and time-specific error terms.

Furthermore, recall that we specified an FE modelfor equations of content generation and usage fre-quencies. It is well known that estimation of an FEmodel with a lagged endogenous variable is subjectto potential finite-sample bias (Nerlove 1967, Nickell1981). Our analysis may suffer from this bias becausethe number of observations per user is 13 for theentire sample but only 5 for the subsample. Hence,we take a first-differencing transformation on eachvariable in the model to alleviate the potential biasfrom the fixed effect model (Wooldridge 2002) anddifference out both observed and unobserved user-specific, time-invariant variables (e.g., age, gender,job characteristics, prior Internet experience, etc.).12

Although we find there exists no serial correlationin the error term from the selection equation and inthe error term from each of the main equations sep-arately, we control for the potential serial correlationin the main simultaneous equations of content gener-ation and usage by using the robust variance matrix(Wooldridge 2002).13 The robust variance matrix esti-mator (Arellano 1987) is valid in the presence of serialcorrelation in error terms in Equations (4) and (5)(Wooldridge 2002).

To be specific, the first-differenced content genera-tion frequency and usage frequency equations that weestimate are specified as follows, for t=21310001T :

ãlog4Uploadi1t5

=�1ãlog4Downloadi1t−15+�2ãlog4Mobilityi1t5

+�3ãlog4Social Network Uploadi1t−15

+�4ãlog4Voicei1t5+�5ãSelectioni1t+�6ãlog4g−i1t5

+ã�t+ãvi1t1 (6)

ãlog4Downloadi1t5

=+�1ãlog4Uploadi1t−15+�2ãlog4Mobilityi1t5

+�3ãlog4Social Network Downloadi1t−15

+�4ãlog4Voicei1t5+ã�5Selectioni1t+�6ãlog4h−i1t5

+ã�t+ã�i1t0 (7)

12 This approach is similar to Verbeek’s (1990), where Verbeek takesthe within transformation to eliminate the incidental parametersand maximizes the likelihood of the transformed data. He alsoshows that the corresponding estimator is consistent, even whenonly a few time series observations are available.13 We include the time-based control variables to control for under-lying time trends that help eliminate serial correlation in our data.That said, we estimate our main equations separately, using thegeneralized estimating equations method and find that the esti-mated AR(1) coefficients in the upload equation and the downloadequations are 0.006 and 0.007, respectively. Obviously, with suchsmall AR(1) values, the other estimates qualitatively remain thesame as in the result for our main model.

Page 10: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile Internet1680 Management Science 57(9), pp. 1671–1691, © 2011 INFORMS

4.3. IdentificationWe discuss two issues in the identification of ourmodel: (1) identifying the selection equations and themain equations of content generation and usage fre-quencies, and (2) identifying the social network effect.

4.3.1. Identification of the Selection Equationsand the Main Equations. Our identification strategyfor the selection equations and the main equationsincludes normalization of parameters for binary deci-sion variables, exclusion restrictions, and the use ofinstrument variables as part of our estimation process.

In the selection equations, because a user’s mobileInternet session initiation is a binary choice, we needlocation and scale normalization on the latent depen-dent variable, Session∗

i1t , for identification. For loca-tion normalization, we set the user i’s utility of notengaging in any mobile Internet sessions in week t asSession∗

i1t =0. For scale normalization, we set the vari-ance of the unobserved, user-specific, time-specificeffect, �2

�, to 1. In addition, sample selection issuesrequire our explicitly estimating the selection equa-tion, using both time invariant and time varyingregressors (i.e., age, sex, mobile Internet session ini-tiation by social network). We call these variables Z.Because a selection correction term is a nonlinearfunction of the variables included in the selectionequations, our main equations of content generationand usage frequencies with regressors X are iden-tified due to this nonlinearity, even if Z=X. How-ever, this nonlinearity arises from the assumption ofnormality in the probit model. Thus, we check anexclusion restriction by including variables in Z thatare not included in X, which makes the identifica-tion cleaner (Puhani 2000). The exclusion restrictionis satisfied because we include some time-invariantvariables (e.g., age and gender) as well as a time-varying variable (e.g., mobile Internet session initia-tion by social network) only in the selection equation,but exclude these variables in the main equations.

In the main equations, in the absence of better data,we use time-series-based instruments for identifica-tion. Content upload and download frequency vari-ables in a given week are taken to be endogenous tothe system of equations, whereas all other variablesin the system are treated as exogenous to the systemor predetermined. For example, in the content uploadfrequency equation, variables like lagged downloadfrequency, geographical mobility, and lagged socialnetwork (see §4.3.2) are exogenous or predetermined.This is true for the following reasons.

First, geographical mobility is exogenous because itis very unlikely that one’s mobility is determined byone’s propensity to generate and use content. Instead,it is far more likely that a user’s propensity to generateand use content is driven by the extent of their geo-graphical mobility. Second, we assume that only past

content usage level affects the current content genera-tion level of a user and similarly only past content gen-eration usage level affects the current content usagelevel of a user. Hence, the download frequency vari-able at time t−1 is predetermined because it cannotbe determined at time t. This is true because the errorterm at time t in Equation (4) is uncorrelated with cur-rent and lagged values of the predetermined variable(i.e., E6log4Downloadi1t−15vi1t7=0) but may be corre-lated with future values (i.e., E6log4Downloadi1t5vi1t7 6=0). This claim holds true because we found no serialautocorrelation in the error terms after controllingfor time trends (see §4.3.2). For the same reason, logupload at t−1 in the Equation (5) is a predeterminedvariable. This alleviates the concern of endogeneityfrom simultaneity and thus focuses on the dynamicsin the interdependence between content generationand usage of a user. Third, the lagged social net-work variable is predetermined at time t, because wecontrol for other sources for endogeneity from usinga social network variable as a regressor (see §4.3.2).These are valid instruments for the endogenous vari-able because they are uncorrelated with the unobserv-able error term. A similar set of arguments applies tothe content download frequency equation.

As a robustness check, we also include a nontime-series-based variable as an instrument. In partic-ular, we include the handset age variable as anadditional instrument in the content generation equa-tion, excluding it from the content usage equation.The key assumption behind this argument is that theage of the handset is more likely to impede usersfrom uploading multimedia content than download-ing content. Users can download multimedia content,irrespective of how advanced their handset featuresare, such as the number of pixels in their mobilecamera, the technical sophistication of the software,and applications installed in the handset—the kindsof features that are required to experience multime-dia content downloaded from the Web. In contrast,the lack of handset functionality and advanced fea-tures is more likely to prevent users from creatingand uploading content because older phones are notequipped with advanced digital cameras and audio/video/photo editing applications—the kinds of fea-tures users need to upload content on the Internet.The qualitative nature of all these results remains thesame, however, with the inclusion of this variable asan instrument.

Furthermore, we examine whether both the neces-sary order condition and sufficient rank condition aresatisfied for our main equations of content genera-tion and usage frequencies. The order condition is metbecause each equation excludes exogenous or prede-termined variables (i.e., a mobility variable, a laggedsocial network variable, a lagged temporal interde-pendent variable, mean mobile Internet activity of

Page 11: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile InternetManagement Science 57(9), pp. 1671–1691, © 2011 INFORMS 1681

all other users in user i’s billing zip code area, etc.),although it has no right-hand-side endogenous vari-able. Furthermore, we check the rank condition usingBaum’s (2007) Stata code to find that the rank condi-tion for each main equation is satisfied.

4.3.2. Identification of the Social Network Effect.To address the endogeneity issue of the social net-work variable, we adopt the identification strategyand modeling approach in accordance with the priorwork that has studied the impact of the social net-work effect on user behavior (Manski 1993, Hartmanet al. 2008, Nair et al. 2010). We distinguish causal-ity from correlation by separating causal effects fromeach of three sources of correlation: (1) endogenousgroup formation, (2) correlated unobservables, and(3) simultaneity. In doing so, we incorporate severaladditional variables. We explain these in detail below.

First, regarding the endogenous group formation,the observed correlation in the behavior of an individ-ual and other individuals in the social network couldarise from omitted individual characteristics that cor-relate within the group. For example, users can chooseto call users with similar tastes in multimedia contentgeneration and usage. They might virtually meet atsocial networking sites or physically meet in the sameoffice/home neighborhood, and such meetings canlead to the formation of groups endogenously. Thiswill produce an upward bias in the social networkeffects. Consistent with the literature, we include auser-specific random effect in the selection equation(i.e., Equation (3)) and a user-specific fixed effect inthe main equations (i.e., Equations (4) and (5)) toaccount for this issue.

Second, regarding the correlated unobservables,some unobservables could drive the behavior of anindividual and other individuals in one’s social net-work. For example, marketing promotions targetedto users or exogenous incidents, such as celebritysightings and festive events, can drive individualsin a given social network to upload and downloadphotos/videos to social networking sites. If uncor-rected, this effect could be mistaken for a social net-work effect. In other words, any spatially correlated,location-specific shocks to users in the same groupcan lead to such a bias, and we need to account for it.

In addition to adding the user-specific effect, thereare three additional controls for correlated unobserv-ables. The first is to include time-period fixed effects.These can control for common factors or shocks toall individuals at a given time. Examples include themobile carrier’s nationwide mobile marketing cam-paigns or promotions, such as free trial downloads ofcontent. The second is location fixed effects (i.e., zipcode dummies). These can control for time-invariant,spatially correlated unobservables. For example, usersin urban areas may be more tech savvy and more

prone to engaging in mobile Internet activities com-pared to users in rural areas. The third is to includetime and location effects to control for unobserv-ables that correlate at the level of zip code and time.We apply both billing-zip-code-based and calling-zip-code-based content activities variables.14 For exam-ple, celebrity sightings, concerts, family weddings,social events, festivals, unusual street incidents, etc.,may give people the opportunities to capture andshare such moments with friends and families viatheir mobile devices. Hence, we include the time- andlocation-specific mean session initiation frequency ofall other users in the same zip code area of user i,denoted by Session−i1t , in the selection equation andthe time- and location-specific mean content uploadand download frequencies of all other users in the zipcode area of user i, denoted by g−i1t and h−i1t in themain equations, respectively.

Our econometric specification is able to address allthree issues required to distinguish causality fromcorrelation in the social network effect—endogenousgroup formation, correlated unobservables, andsimultaneity. Hence, our empirical estimates dodemonstrate a causal effect of the social network onuser content generation and usage on the mobileInternet. However, we would like to be cautiousin our interpretation. In the absence of controlledvariation using natural or field experiments duringour sampling period, we interpret the relationshipbetween social network behavior and user behaviorin our paper as being one that establishes an upperbound on the causal effect of the social network.

5. ResultsIn this section, we discuss our results and briefly sum-marize the key findings from the selection equation.They show that positive state dependence exists, sug-gesting that some users incur switching costs in ourcontext and also that there is a positive social network

14 Each user’s billing-zip-code-based content activity variables cap-ture the demand changes around that user only in the fixed locationover time. To account for spatial change in demand, we includedthree additional location-varying and time-varying average contentactivity variables based on user i’s top three places at which heplaced calls (i.e., calling-zip-code-based content activity variables).The rationale for the inclusion of these variables is that user i andhis network neighbors might travel to the same zip code area ina given week, and hence they could be influenced by an exoge-nous demand shock arising from that location. Examples includecelebrity appearances, concerts, some unusual street incidents, etc.Our results confirm that with the inclusion of an additional three(location- and time-varying content activity) control variables, thelagged network variable estimates remain qualitatively the sameas before. Therefore, these results suggest that the impact of corre-lated unobservables from colocation of users on the social networkvariable is not a major issue.

Page 12: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile Internet1682 Management Science 57(9), pp. 1671–1691, © 2011 INFORMS

effect in initiating mobile Internet sessions. Further-more, user mobile Internet initiation behavior greatlyvaries by age, implying an inverted U-shaped rela-tionship between age and mobile media usage witha peak at around approximately 21 years old. Theseestimates appear in Appendix B.

In §5.1, we present the 3SLS estimation resultsof content generation and usage equations using a13-week sample. These results shed light on tempo-ral interdependence and to some extent on the socialnetwork effect.15 In §5.2, we present the estimationresults using a 5-week sample that contains data onthe communication strength between social networkneighbors and data on the geographical mobility ofusers.16 In §5.3, we look at the cohort analysis resultsto gain additional insights of results through subsam-ple analyses. In §5.4, we discuss a series of robustnesschecks that demonstrate the robustness of our mainresults. In §5.5, we discuss the economic implicationsof our results.

5.1. Results Using the Total SampleRecall that we can observe the content upload anddownload frequency of users only if they initiate theirmobile Internet sessions. Results show that our esti-mates for a selection correction term are indeed posi-tive and statistically significant in content generationand usage equations (the coefficient estimates are0.0216 and 0.7267, respectively). This suggests thatpeople who more frequently initiate mobile Inter-net sessions are more likely to engage in upload-ing and downloading content as opposed to thosewho less frequently initiate mobile Internet sessions.Moreover, results from the selection equations (seeAppendix B for details) show that males and youngerusers who are prone to uploading or downloadingcontent through their mobile phones more frequentlyinitiate mobile Internet sessions. These results confirmthat controlling for sample selection bias is crucial inour setting.

The main results from the simultaneous equationsof content upload and download using the full sam-ple are given in Table 2. We find that there is a

15 Recall that we do not observe voice call records during the entire13-week period, but observe them only over a 5-week period. Thus,to measure the amount of content generation and content usage ofnetwork neighbors of a given user at a given time period, we defineuser i’s network neighbors based on 5-week voice call records andtreat them as fixed throughout the 13-week period; that is, thegroup of network neighbors of user i is denoted as n4i5 in the13-week sample, rather than nt4i5.16 We implemented all models here without the social network vari-able to alleviate any remaining concerns about endogeneity, evenfrom the use of a lagged social network variable. We find qualita-tively the same result between with and without the lagged net-work variable. Details are available upon request.

Table 2 3SLS Estimation Results on Content Frequency (Total Sample)

Dependent variable Explanatory variable Coefficient

Log Upload Log Download Frequency (t−1) −000091Frequency (t) 40000055∗∗∗

Log Upload Frequency by NN (t−1) 00011540000425∗∗∗

Selection (t) 00021640000075∗∗∗

Log Download Log Upload Frequency (t−1) −000178Frequency (t) 40000735∗∗∗

Log Download Frequency by NN (t−1) 00014340000475∗∗∗

Selection (t) 00726740000265∗∗∗

Notes. NN refers to network neighbors. Estimates for time-period fixedeffects and mean uploading and downloading frequency of all other users inuser i ’s billing zip code effects are not reported for brevity.

∗∗∗Significant at 0.01.

negative and statistically significant temporal interde-pendence between content generation and usage. Thisfinding implies that an increase in content usage in aprevious period does associate with a decrease in con-tent generation in the current period and vice versa(the coefficient estimates are −0.0091 and −0.0178,respectively). This finding provides evidence that theresource constraint (e.g., time and money constraint)binds at least for some people.

We find positive and statistically significant socialnetwork effects in content generation and usage equa-tions (the coefficient estimates are 0.0115 and 0.0143,respectively). We also find statistically significant esti-mates for our control variables like time-period dum-mies, mean uploading and downloading frequenciesof all other users in user i’s billing zip code vari-ables, etc.

We discuss the impact of each effect using theirmarginal effects.17 For example, a one-standard-deviation increase in the frequency of content down-loading in the previous period decreases the frequencyof content uploading in the current period by 3.6%when evaluated at the mean. Similarly, a one-standard-deviation increase in the frequency of con-tent uploading in the previous period decreases thefrequency of content downloading in the currentperiod by 5.1%. Thus, the marginal effect of temporalinterdependence is asymmetric and stronger in con-tent usage than it is in content generation.

In addition, the marginal effect of social networkeffect is larger in content usage than in content gen-eration (11.0% and 1.5%, respectively). Recall that thisresult of lagged social network effect does not incor-porate the communication strength between users

17 Given the log–log specification in Equations (4) and (5), the coef-ficients represent elasticities. In addition to these elasticities, weinterpret the coefficients using marginal effects as well as economicimplications (see §5.5).

Page 13: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile InternetManagement Science 57(9), pp. 1671–1691, © 2011 INFORMS 1683

when imputing the structure of the social network fora given user. When we incorporate this information,we can obtain a more accurate measure of the socialnetwork effect on user behavior. This finding is dis-cussed below in §5.2.

5.2. Results Using the Communication Strengthand Geographical Mobility Subsample

It is possible that a user’s content generation propen-sity more strongly associates with that of his fam-ily, close friends, or colleagues (whom the user callsmore frequently or speaks to for a longer duration),rather than that of acquaintances (whom the user callsless frequently or speaks to for a shorter duration).To incorporate the dynamic, weighted social networkeffect on user behavior, we next present results from a3SLS estimation using a 5-week sample that has time-varying data on the extent of communication strengthbetween users and their network neighbors. In addi-tion, the 5-week sample includes the four differentmobility metrics of users described earlier in §3.

We conduct our analyses on both frequency-basedand duration-based models in which we incorpo-rate call frequencies and call durations, respectively,to determine the magnitude of the communicationstrength. Our results are robust for the use of eitherfactor (call frequencies and call duration) as a weightfor computing the strength of social network effect.Our results are also robust for the exclusion of thenumber of voice calls as a control.

The main results are given in Table 3. As before,the estimates for selection correction terms are pos-itive and statistically significant in both equations,reassuring that controlling for sample selection biasis crucial in our setting. Our results show that thereexists a negative and statistically significant interde-pendence between content usage in the current periodand content generation in the previous period. Asan example, the coefficient estimates are −0.0098 and−0.0239, respectively, in a model with a mean localmobility variable. This result is consistent, irrespec-tive of the use of different mobility variables. Asbefore, this finding lends support to the claim thatthe resource constraint (e.g., time and money con-straint) binds at least for some. These effects are alsoasymmetric. The marginal effect of temporal interde-pendence is stronger in content downloading than incontent uploading (−6.8% and −4.3%, respectively,in a model with a mean local mobility variable)—namely, the negative impact of a previous period’scontent uploading on the current period’s contentdownloading propensity is higher than the opposite.

We find that our mobility metrics positively asso-ciate with content generation and usage activitiesof users. For example, mean local mobility of auser positively associates with content generation and

usage (the coefficient estimates are 0.0087 and 0.0193,respectively). We find that the marginal effect of meanlocal mobility on content downloading is higher thanon content uploading (1.2% and 0.5%, respectively)when evaluated at the mean, whereas the marginaleffect of mean national mobility on content down-loading and content uploading is similar (0.5% and0.3%, respectively). In addition, local mobility disper-sion positively associates with content generation andusage (the coefficient estimates are 0.0057 and 0.0328,respectively). We also find that the marginal effecton content downloading is much higher than it ison content uploading (3.3% and 0.6%, respectively).Similarly, the marginal effect of national mobility dis-persion on content downloading is higher than theeffect on content uploading (1.0% and 0.3%, respec-tively). These results suggest that users more fre-quently engage in content downloading than contentuploading when traveling. Furthermore, in general,the variance of user travel patterns (mobility disper-sion) has a stronger impact on their mobile Internetactivities than on the mean (mean mobility).

The relationship between content generation andusage behavior of users and their social networksis positive and statistically significant (for example,the coefficient estimates are 0.0162 and 0.0348, respec-tively, in the model applying the mean local mobil-ity variable). This result is consistent, irrespective ofthe use of different mobility variables. We find thatthe marginal effect of the social network on contentdownloading is much higher than it is on contentuploading (26.8% and 1.4%, respectively). In addi-tion, we find statistically significant estimates forour control variables like time-period dummies andmean uploading and downloading frequencies of allother users.

5.3. Cohort Analysis ResultsWe implement four sets of cohort analyses. First, auseful test for the central notion of economic behav-ior under resource constraints espoused in this paperis to examine whether users who appear closer toa binding constraint on resources show a stronger(negative) temporal interdependence between contentgeneration and usage. Toward this end, we dividethe sample based on the age of the user into twocohorts: “younger” users (below the age of 22) and“older” users (above the age of 22). Results are robustwith respect to this age cutoff point. The assump-tion is that younger users are more likely to facemonetary constraints compared to older ones becauseof a lower amount of discretionary income. Hence,we would expect a greater level of negative tempo-ral interdependence between content generation andusage behavior for such users. 3SLS estimation resultsshow that this assumption holds true. For example,the marginal effect of temporal interdependence in

Page 14: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile Internet1684 Management Science 57(9), pp. 1671–1691, © 2011 INFORMS

Table 3 3SLS Estimation Results on Content Frequency

Dependent variable Explanatory variable Coefficient

Log Upload Frequency (t) Log Download Freq. (t−1) −000098 −000098 −000097 −00009740000055∗∗∗ 40000055∗∗∗ 40000055∗∗∗ 40000055∗∗∗

Log Mean Local Mobility (t) 00008740000145∗∗∗

Log Mean National Mobility (t) 00016340000255∗∗∗

Log Local Mobility Dispersion (t) 00005740000165∗∗∗

Log National Mobility Dispersion (t) 00005740000125∗∗∗

Log Number of Voice Calls (t) 000010 000013 000018 00001040000075 40000075∗ 40000075∗∗∗ 40000075

Log Upload Freq. by NN (t−1) 000162 000163 000163 00016340000545∗∗∗ 40000545∗∗∗ 40000545∗∗∗ 40000545∗∗∗

Selection (t) 000118 000116 000126 00013040000045∗∗∗ 40000045∗∗∗ 40000045∗∗∗ 40000045∗∗∗

Log Download Frequency (t) Log Upload Freq. (t−1) −000239 −000238 −000234 −00023440000925∗∗∗ 40000925∗∗∗ 40000915∗∗∗ 40000915∗∗∗

Log Mean Local Mobility (t) 00019340000575∗∗∗

Log Mean National Mobility (t) 00021840000985∗∗

Log Local Mobility Dispersion (t) 00032840000655∗∗∗

Log National Mobility Dispersion (t) 00021840000475∗∗∗

Log Number of Voice Calls (t) 000385 000401 000377 00035640000275∗∗∗ 40000265∗∗∗ 40000265∗∗∗ 40000285∗∗∗

Log Download Freq. by NN (t−1) 000348 000348 000348 00034840000575∗∗∗ 40000575∗∗∗ 40000575∗∗∗ 40000575∗∗∗

Selection (t) 003633 003635 003658 00366640000145∗∗∗ 40000145∗∗∗ 40000145∗∗∗ 40000145∗∗∗

Notes. We included each one of the two mean mobility metrics and the two mobility dispersion metrics in our main equations at both the local and nationallevels. NN refers to network neighbors. Estimates for time-period fixed effects and mean uploading and downloading frequency of all other users in user i ’sbilling zip code effects are not reported for brevity.

∗Significant at 0.1; ∗∗significant at 0.05; ∗∗∗significant at 0.01.

content downloading (content uploading) is −8.7%(−4.1%) for the younger user cohort, whereas it is−2.7% (−2.0%) for the older user cohort. This findingimplies that the resource constraint binds more tightlyon younger users than on older users.

Second, we divide the sample based on the locationof the user into two cohorts: “urban” users (who live insix major cities in South Korea) and “suburban” users(who live in other areas in South Korea). The premiseis that urban users are more likely to have better 3Gbroadband coverage as well as travel-related discre-tionary time via public transportation (i.e., subways orbuses) compared to suburban users. Hence, we wouldexpect a higher impact of mean mobility on urban usercontent generation and usage behavior. The 3SLS esti-mation results show that this assumption also holdstrue. For example, the marginal effect of mean localmobility on content downloading (content uploading)is 2.2% (1.8%) in the urban user cohort, whereas it is1.8% (0.3%) in the suburban user cohort.

Third, we implement a subsample analysis byexcluding users who either upload or download

disproportionately to alleviate potential bias fromincluding these outliers (i.e., user-generated-contentjunkies or free riders). The subsample consists ofthose users whose upload frequency is in a similarrange to their download frequency. To be specific, weinclude a user into our subsample if the absolute dif-ference between download frequency and upload fre-quency for the same user is less than a given cutoffvalue (e.g., 3, 5, or 10). We thus find qualitatively thesame result as for the main result.

Last, we run analyses on a subsample consistingof only those users who engage in both generationand usage activities in the same week at least once inthe sample to mitigate the potential bias from includ-ing users who either upload or download but do notengage in both activities. This subsample constitutes15.7% of the total sample. We find support for thenegative temporal interdependence between contentuploading and downloading behavior. The results forthe geographical mobility effect and the social net-work effect are qualitatively the same as in our mainresults.

Page 15: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile InternetManagement Science 57(9), pp. 1671–1691, © 2011 INFORMS 1685

5.4. Robustness ChecksWe did implement a series of robustness checks.Because of the evidence of positive state dependencefrom the selection equation results, we conduct teststo check the robustness of the results by estimatingthe main equations separately with a lagged depen-dent variable to control for the state dependence,using GMM-based dynamic panel data model. We findthat the results are qualitatively the same as our mainresults (for details, see Table C.1 in Appendix C).

To capture unobserved heterogeneity among theusers, we also estimate a mixed effect model wherewe include a random coefficient for a constant term(i.e., �0 in Equation (4) and �0 in Equation (5)). Fur-thermore, strictly speaking, user content upload anddownload variables in our sample take on nonnega-tive integer values. We thus use various types of lin-ear models (i.e., 3SLS and GMM dynamic panel datamodels). However, for count data, linear models dohave shortcomings. Hence, one could argue that weshould examine our questions using count data mod-els. However, count data models with fixed effectsare known to suffer from the “incidental parametersproblem,” except for the Poisson model. Therefore,we did implement a Poisson fixed effect model inwhich the incidental parameter problem is not a prob-lem (Lancaster 2000, Greene 2007). We find that theresults are qualitatively the same as our main resultsfrom 3SLS estimation (for details, see Table C.2 inAppendix C).

Because the total amount of money and timeresources of a user can potentially vary every week,an alternative model would be to model the user’sshare of content uploads with respect to the user’stotal amount of content uploads and downloads (con-tent share model ). Besides the frequency of contentuploads and downloads, costs that users will incuralso depend on how many bytes a user uploads anddownloads. Toward this end, we use the amount ofbytes uploaded and downloaded for each user insteadof frequency of uploading and downloading (contentsize model ). We find that these results remain the sameas for the 3SLS estimation result for our main model(for details, see Tables C.3 and C.4 in Appendix C).18

5.5. Economic ImplicationsWe can see clearly that elasticity estimates for interde-pendence parameters are small. To understand theireconomic importance in the context of this industry,it is not that useful to study the impact of a smallpercentage change in the variables. In particular, such

18 We also tried alternative specifications for social network effects—a lagged cumulative effect and lagged binary indicator. Neither ledto any change in the qualitative nature of the results. Details areavailable upon request.

is the case because the mean of several variables issmall, whereas the standard deviation is proportion-ately much higher. Instead, it is more meaningful toevaluate the impact of different percentile changesin content generation and usage frequency variablesfor different sizes of the user base and impute theirmonetary value.

To this end, we look at the economic impact on thetop 25th percentile user group based on this group’soverall content upload and download frequencies.19

This user group represents 2,500,000 users of thecompany contained in our data (the company hasabout 10 million users). We assume that an exoge-nous shock shifts the content upload and downloadactivity level at time 0 from the top 25th to the top10th percentile.20 In this setting, users are charged bythe amount of traffic transmitted during their contentgeneration, and usage activities are charged at $1.5per one megabyte of data transmission. Based on thetop 25th percentile of the user group data (2,500,000users), a one-time content upload leads to an aver-age of 0.14 megabytes of data transmission, and auser uploads about 0.49 times on average during aweek. As noted above, the elasticity of current contentupload frequency with respect to the lagged contentdownload frequency is −0.0098. Suppose there is anincrease in the usage frequency of this group suchthat that usage takes the group from the top 25th per-centile (48.8 times/week) to the top 10th percentile(82 times/week) in week 0.

The “immediate” impact from an increase in thedownload frequency in week 1 is then as follows:elasticity×change rate in usage×average number of bytesuploaded per instance×price per byte×number of users×average number of times downloaded per week. This pro-duces an amount of −$170,814. One week’s rev-enue gain will be equal to the following week’s lossbecause a higher frequency of downloads in week twill lead to a lower frequency of uploads in week t+1and vice versa.

To account for this dynamic effect, we analyze“long-term” effects by using the impulse responsefunction of a shock. The impulse response functionis widely used to trace the impact of a shock ona single endogenous variable being introduced intothe coupled system (Dekimpe and Hanssens 1995,

19 We also look at the economic impact on other user groups—top1st percentile user group and top 10th percentile user group. Detailsare available upon request.20 We examine the economic impact from other percentile changesin mobile Internet content generation and usage frequencies as wellbased on the following amounts: (1) from the top 90th to the top10th percentile; (2) from the top 75th to the top 25th percentile; and(3) from the top 50th to the top 25th percentile. The main implica-tions remain consistent, irrespective of the percentiles. Details areavailable upon request. Furthermore, we assume that the shock isexogenous and does not alter the parameter estimates of the model.

Page 16: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile Internet1686 Management Science 57(9), pp. 1671–1691, © 2011 INFORMS

Vanhonacker et al. 2000). In a similar way, using thecorresponding values for the variables in the aboveequation, we compute the impact in weeks 2, 3, andso on. This leads to an annualized long-term effect of−$6,761,035. This finding implies that the companycan incur a loss of approximately $6.76 million intraffic revenues a year from the top 25th percentileuser group because of the negative interdependencebetween content generation and usage. This $6.76 mil-lion constitutes more than 2.6% of the company’sgross annual revenue accrued from the top 25th per-centile user group.21

Our discussion of “long-term” effect is based onshocks that do not incur any monetary expenses bythe company. For example, shocks such as celebrityappearances, concerts, family weddings, social events,festivals, unusual street incidents, etc., may providepeople with opportunities to capture and share suchmoments with friends and families using their hand-held mobile devices. Other shocks explicitly gener-ated by the firm may involve that mobile carrier’smarketing actions and policy changes, which can alsostimulate changes in user behavior. Examples includediscounted rate plans on Internet usage, discounts onhandsets conditional on certain usage levels, targetedcoupons promoting Internet usage, etc. However, wedo not have data on these. Hence, the monetary effectof a shock that is based on the parameter estimatesby the current model would provide only an upperbound of the effect.

6. Discussion and ImplicationsMobile-Internet-based content services constitute oneof the fastest-growing applications on the Web today.However, very little is known about the interdepen-dence between content generation and content usagebehavior of users. Nor do we know much about theother factors that can drive user behavior on themobile Internet. To examine these factors, we presenta generalized empirical framework of user behaviorin the mobile Internet space. We analyze that frame-work using an unprecedented user-level data set thatcontains rich data on user demographics, calling pat-terns, calling locations, and social networks.

Our data come from a setting where users enrollin usage-based data pricing. Recently media reportshave pointed out that the major wireless carriers inthe United States, namely, AT&T and Verizon Wire-less, are getting ready to implement a usage-baseddata pricing scheme for data/Internet usage (RethinkWireless 2010). These mobile carriers have profitedfrom low fixed-fee penetration pricing. However,these carriers have seen enormous growth in data

21 Because we use high-frequency data, same-period effects may beminimal in our setting.

traffic, which often outpaces the capacity of their net-works. For example, AT&T has experienced 5,000%growth in data traffic over the past three years, but40% of that traffic is consumed by just 3% of its smart-phone users (Rethink Wireless 2010). A global sur-vey of mobile telecom executives across 50 countriesand six continents conducted by The Economist (2010)revealed that 60% of respondents believe usage-baseddata pricing is the way of the future in mobile dataand content services. Our results based on this usage-based data pricing scheme can provide useful man-agerial insights for companies’ contemplating suchpricing schemes.

The insights from this study also have manage-rial implications. First, the asymmetric, negative tem-poral interdependence between content generationand usage provides mobile phone companies withinsights on how to stimulate content generation andusage. Our results imply that content generationrequires disproportionately more effort and resourcesthan does content usage. This could stem from tech-nical complexity in using content generation anduploading functionalities on their phones, lack ofprior experience in uploading content through mobiledevices, reduced uploading speed due to bandwidthcongestion, etc. There is some evidence of technicaldifficulties’ occurring in user content generation in anonline setting (Stoeckl et al. 2007). Hence, companiescould provide easy-to-use content preprocessing toolsand less complicated content uploading proceduresfor the mobile Web. This asymmetric interdependencecould provide mobile phone companies with newideas regarding differential pricing strategies for datauploading versus data downloading and increase userengagement with mobile media.

Furthermore, the provision of monetary incentivesmight be effective in triggering user content gener-ation behavior. In many ways, content diffusion ina mobile Internet environment is similar to that inpeer-to-peer networks because free riders can createsupply-side constraints. Hence, firms that engage inmobile content provision and advertising could con-sider offering distribution referrals as monetary pay-ments to users who generate and distribute content.Implementation could be delivered through discountsin data transmission charges given to such users.

Second, the asymmetric association of the extent ofuser geographical mobility with their content usagecompared to content generation behavior can providemobile phone companies with additional insights ontargeted mobile advertisements. Previous research onlocation-based mobile advertising shows that usersfind ads distressful when they receive them in homeand work locations (Banerjee and Dholakia 2008). Ourresults suggest that users more frequently engage incontent usage compared to content generation whenthey are traveling. In addition, the variance of user

Page 17: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile InternetManagement Science 57(9), pp. 1671–1691, © 2011 INFORMS 1687

travel patterns has a stronger impact on mobile Inter-net activities than the mean of user travel patterns.Using this precise mobility information, firms couldengage in dynamically personalized mobile advertise-ments that would be less annoying to users.

Third, our results on the positive effect of thesocial network on user behavior can provide insightsinto how companies can target the right set of usersand influence users in mobile Internet space. Hence,mechanisms designed to update a user instantly onthe frequency with which his network neighbors havedownloaded or uploaded a certain type of content canaffect the incentives of that user to do the same.

Our paper does have limitations. These limita-tions arise mainly from the lack of data. For exam-ple, we do not have information about the spe-cific type of content uploaded or downloaded (e.g.,photo, audio, text, etc.) and the destination web-sites (e.g., social networking sites, mobile portalsites, etc.). Future work could examine this informa-tion. Another area for future research is to studyhow content generated via mobile phones diffusesthrough different kinds of social networks, such as

Appendix A

Table A.1 Notations and Variable Descriptions

Sessioni1t Whether user i started mobile Internet sessions in week t (1=yes, 0=no)Social Network Sessioni1t−1 Weighted average number of mobile Internet session initiations by network neighbors of user i at time t−1; that is,

m∈nt−14i54wi1m1t−1 ·Sessionm1t−15

nt−14i5 User i ’s social network neighbors based on voice call records (i.e., users called by user i) in week t−1Sessionm1t−1 Whether user i ’s social network neighbor m started his/her mobile Internet sessions in week t−1 (1=yes, 0=no)wi1m1t−1 Normalized number of calls user i made to user m in week t−1, that is, wi1m1t is a fraction of voice calls from user i to

user m in week t−1 with respect to the total voice calls originated from user i in week t−1Agei User i ’s ageSexi User i ’s sex (1=male, 0= female)Handset Agei Months elapsed since user i ’s handset was launched in the marketSocial Network Sessioni Time mean mobile Internet session initiation by social network neighbors of user i�i Unobservable, user-specific, time-invariant effect 4�i ∼ IIN401� 2

� 55, where IIN is independent identical normal�t Time-period fixed effectsZ−i1t Mean mobile Internet session initiation of all other users in user i ’s billing zip code�i1t Unobservable, individual-specific, time-specific effect 4�i1t ∼ IIN401� 2

� 55; �2n is set to 1 for normalization, and �i are

independent of �i1t

� Initial conditions parameterUploadi1t Number of times user i uploaded content in week t

Uploadm1t−1 Number of times user i ’s network neighbor m uploaded content in week t−1Downloadi1t Number of times user i downloaded content in week t

Downloadm1t−1 Number of times user i ’s network neighbor m downloaded content in week t−1Mean Mobilityi1t Any one of the following two mean mobility metrics: (1) number of unique zip-code-level locations from where user i

placed calls in week t ; (2) number of unique province-level/state-level locations from where user i placed calls inweek t

Mobility Dispersioni1t Any one of the following two mobility dispersion metrics: (1) fraction of geographical deviation from one’s commonlyvisited places at zip code level for a user in week t ; (2) fraction of geographical deviation from one’s commonly visitedplaces at province/state level for a user in week t

Selectioni1t Selection correction term for user i at time t

g−i1t , h−i1t Mean uploading and downloading frequencies of all other users in user i ’s billing zip code area, respectively�0, �0 Intercepts�i , �i User-specific dummies�t , �t Time-period dummies�i1t , vi1t , �i1t Unobservable, user-specific, time-specific effect; �i1t ∼ IIN401� 2

� 5, vi1t ∼ IIN401� 2n 5, and �i1t ∼ IIN401� 2

� 5

text-messaging-based networks, multimedia content-based networks, and offline location-based spatialnetworks. We hope that this study will generate fur-ther interest in and engagement with the emergingliterature on the economics of user-generated mobilecontent and, more broadly, in the growth in mobilecommerce and the mobile Internet.

AcknowledgmentsThe authors thank Sanjeev Dewan, Wendy Duan, AviGoldfarb, Oliver Yao, and participants at the StatisticalChallenges in Electronic Commerce Research 2009 andInternational Conference on Information Systems 2009 con-ferences for helpful comments. They also thank seminarparticipants at University of Minnesota, New York Uni-versity, University of Texas at Dallas, and University ofMaryland. The authors acknowledge financial support fromNational Science Foundation CAREER Award IIS-0643847,the Korea Research Foundation Grant funded by the KoreanGovernment (KRF-2008-356-B00011), a Google-WPP Mar-keting Research Award, the Wharton Interactive MediaInstitute–Marketing Science Institute, and the NYU SternCenter for Japan–U.S. Business and Economic Studies. Theusual disclaimer applies.

Page 18: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile Internet1688 Management Science 57(9), pp. 1671–1691, © 2011 INFORMS

Appendix B. Selection Equation ResultsTable B.1 shows the results from the RE dynamic probitmodel. In the second week and thereafter (i.e., t≥2), wefind the estimate for Session (t−1) is positive (0.4602) andstatistically significant, suggesting a positive state depen-dence in initiating mobile Internet sessions. The estimatefor Session by NN (t−1) is positive (0.0020) and statisticallysignificant, suggesting a positive impact of lagged socialnetwork. An interesting aspect is that user behavior greatlyvaries by age, given that the coefficient of Age is posi-tive (0.0960) and statistically significant, and the coefficientof Age2 is negative (−0.0023) and statistically significant,implying an inverted U-shaped relationship between ageand mobile media usage with a peak around approximately21 years old. Results also show that male users are morelikely to engage in mobile Internet content activities thanfemale users. For the first week of observation window (i.e.,t=1), we observe similar results regarding age and genderas above. In addition, the estimate for Handset Age is nega-tive (−0.0007) and statistically significant. This implies thatthe oldness of a 3G mobile handset is negatively associatedwith user propensity to engage in mobile content activi-ties. Furthermore, note that the significant estimate for �suggests that the exogeneity of initial conditions is strongly

Appendix C. Robustness Check Results

Table C.1 GMM-Based Dynamic Panel Data Model Results on Content Frequency

Dependent variable Explanatory variable Coefficient

Log Upload Frequency (t) Log Upload Freq. (t−1) 000744 000748 000735 00073540000545∗∗∗ 40000575∗∗∗ 40000515∗∗∗ 40000515∗∗∗

Log Download Freq. (t−1) −000234 −000234 −000234 −00023440000115∗∗∗ 40000115∗∗∗ 40000115∗∗∗ 40000115∗∗∗

Log Mean Local Mobility (t) 00012740000385∗∗∗

Log Mean National Mobility (t) 00019040000765∗∗∗

Log Local Mobility Dispersion (t) 00006840000405∗

Log National Mobility Dispersion (t) 00003740000195∗

Log Number of Voice Calls (t) 000040 000030 000055 00004540000195∗∗ 40000155∗∗ 40000265∗∗ 40000165∗∗∗

Log Upload Freq. by NN (t−1) 000300 000298 000301 00030240000985∗∗∗ 40000995∗∗∗ 40000985∗∗∗ 40000985∗∗∗

Log Download Frequency (t) Log Download Freq. (t−1) 001093 001109 001051 00105440000545∗∗∗ 40000555∗∗∗ 40000545∗∗∗ 40000575∗∗∗

Log Upload Freq. (t−1) −001840 −001842 −001820 −00182340001575∗∗∗ 40001565∗∗∗ 40001565∗∗∗ 40001565∗∗∗

Log Mean Local Mobility (t) 00014940000185∗∗∗

Log Mean National Mobility (t) 00031440000345∗∗∗

Log Local Mobility Dispersion (t) 00023140000395∗∗∗

Log National Mobility Dispersion (t) 00011240000305∗∗∗

Log Number of Voice Calls (t) 000175 000209 000357 00027140000635∗∗∗ 40000615∗∗∗ 40000775∗∗∗ 40000855∗∗∗

Log Download Freq. by NN (t−1) 000432 000438 000428 00043240001055∗∗∗ 40001075∗∗∗ 40001065∗∗∗ 40001105∗∗∗

Notes. We included each one of the two mean mobility metrics and the two mobility dispersion metrics in our main equations at both the local and nationallevels. NN refers to network neighbors. Estimates for time-period fixed effects and effects of mean uploading and downloading frequency of all other users inuser i ’s billing zip code are not reported for brevity.

∗Significant at 0.1; ∗∗significant at 0.05; ∗∗∗significant at 0.01.

Table B.1 Selection Equation Results

Equation Explanatory variable Coefficient

Session Session 4t−15 (1=yes, 0=no) 004602 (0.0232)∗∗∗

(t≥2) Social Network Session 000020 (0.0008)∗∗

4t−15 (1=yes, 0=no)Age 000960 (0.0217)∗∗∗

Age 2 −000023 (0.0004)∗∗∗

Sex (1=male, 0= female) 001442 (0.0379)∗∗∗

000001 (0.0002)Constant −106300 (0.2898)∗∗∗

Session Age 000548 (0.0219)∗∗

(t=1) Age 2 −000014 (0.0006)∗∗

Sex (1=male, 0= female) 001754 (0.0642)∗∗∗

Handset Age (months) −000007 (0.0003)∗∗∗

Constant −009163 (0.3206)∗∗∗

� 008176 (0.0339)∗∗∗

� 2� 002499 (0.0098)∗∗∗

Notes. Social Network Session is the time of mean mobile Internet initiation by socialnetwork neighbors. Estimates for time-period fixed effects and effects of mean mobileInternet session initiation of all other users in user i ’s billing zip code are not reportedfor brevity.

∗∗Significant at 0.05; ∗∗∗significant at 0.01.

rejected (refer to Equation (3) for �). Finally, based on theseselection equation estimates, we compute a selection cor-rection term, which is later inserted into content generationand usage equations.

Page 19: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile InternetManagement Science 57(9), pp. 1671–1691, © 2011 INFORMS 1689

Table C.2 Alternative Model Results on Content Frequency (Mixed Effect Model and Poisson Fixed Effect Model)

Coefficient

Mixed effect model (with randomDependent variable Explanatory variable coefficients on constant terms) Poisson fixed effect model

Log Upload Log Upload 006512 006517 006527 006517 000011 000011 000011 000010Frequency (t) Freq. (t−1) 40000225∗∗∗ 40000225∗∗∗ 40000225∗∗∗ 40000225∗∗∗ 40000015∗∗∗ 40000015∗∗∗ 40000015∗∗∗ 40000015∗∗∗

Log Download −000069 −000068 −000068 −000068 −306e−4 −307e−4 −305e−4 −306e−4Freq. (t−1) 40000075∗∗∗ 40000075∗∗∗ 40000075∗∗∗ 40000075∗∗∗ 4600e−55∗∗∗ 4600e−55∗∗∗ 4600e−55∗∗∗ 4600e−55∗∗∗

Log Mean Local 000064 000313Mobility (t) 40000175∗∗∗ 40000445∗∗∗

Log Mean National 000083 002621Mobility (t) 40000385∗∗ 40002105∗∗∗

Log Local Mobility 000002 000124Dispersion (t) 40000185 40000285∗∗∗

Log National Mobility 000012 000678Dispersion (t) 40000125 40000915∗∗∗

Log Number of −0000011 −0000006 −0000003 −0000006 000073 000073 000066 000075Voice Calls (t) 400000155 400000055 400000045 400000055 40000065∗∗∗ 40000065∗∗∗ 40000085∗∗∗ 40000065∗∗∗

Log Upload Freq. 000405 000408 000409 000408 000052 000052 000046 000047by NN (t−1) 40000615∗∗∗ 40000615∗∗∗ 40000615∗∗∗ 40000615∗∗∗ 40000155∗∗∗ 40000155∗∗∗ 40000155∗∗∗ 40000155∗∗∗

Log Download Log Download 006671 006673 006672 006673 504e−6 507e−6 503e−6 502e−6Frequency (t) Freq. (t−1) 40000215∗∗∗ 40000215∗∗∗ 40000215∗∗∗ 40000215∗∗∗ 4707e−75∗∗∗ 4707e−75∗∗∗ 4707e−75∗∗∗ 4707e−75∗∗∗

Log Upload −000281 −000279 −000277 −000276 −000011 −000011 −000012 −000012Freq. (t−1) 40001085∗∗∗ 40001085∗∗ 40001085∗∗∗ 40001085∗∗∗ 40000035∗∗∗ 40000035∗∗∗ 40000035∗∗∗ 40000035∗∗∗

Log Mean Local 000326 000057Mobility 40000675∗∗∗ 40000065∗∗∗

Log Mean National 000439 000432Mobility (t) 40001445∗∗∗ 40000205∗∗∗

Log Local Mobility 000092 000007Dispersion (t) 40000695 40000025∗∗∗

Log National Mobility 000104 000120Dispersion (t) 40000455∗∗∗ 40000095∗∗∗

Log Number of 000009 000011 000013 000010 000010 000010 000011 000011Voice Calls (t) 40000025∗∗∗ 40000025∗∗∗ 40000025∗∗∗ 40000025∗∗∗ 40000015∗∗∗ 40000015∗∗∗ 40000015∗∗∗ 40000015∗∗∗

Log Download Freq. 000275 000283 000285 000278 700e−5 700e−5 700e−5 700e−5by NN (t−1) 40000375∗∗∗ 40000375∗∗∗ 40000375∗∗∗ 40000375∗∗∗ 4207e−55∗∗∗ 4207e−55∗∗∗ 4207e−55∗∗∗ 4207e−55∗∗∗

Notes. We included each one of the two mean mobility metrics and the two mobility dispersion metrics in our main equations at both the local and nationallevels. NN refers to network neighbors. Estimates for time-period fixed effects and effects of mean uploading and downloading frequency of all other users inuser i ’s billing zip code are not reported for brevity.

∗∗Significant at 0.05; ∗∗∗significant at 0.01.

Table C.3 GMM-Based Dynamic Panel Data Model Results on Content Share

Dependent variable Explanatory variable Coefficient

Upload Share (t) Download Share (t−1) −001190 −001199 −001199 −00119140001205∗∗∗ 40001205∗∗∗ 40001205∗∗∗ 40001205∗∗∗

Mean Local Mobility (t) 00000640000025∗∗∗

Mean National Mobility (t) 00003340000095∗∗∗

Local Mobility Dispersion (t) 00000240000015∗∗

National Mobility Dispersion (t) 00000440000025∗∗

Number of Voice Calls (t) 800e−5 600e−5 700e−5 400e−54200e−55∗∗∗ 4200e−55∗∗∗ 4200e−55∗∗∗ 4100e−55∗∗∗

Upload Share by NN (t−1) 000797 000798 000801 00080040000475∗∗∗ 40000475∗∗∗ 40000475∗∗∗ 40000475∗∗∗

Notes. We included each one of the two mean mobility metrics and the two mobility dispersion metrics in ourmain equations at both the local and national levels. NN refers to network neighbors. Estimates for time-periodfixed effects are not reported for brevity.

∗∗∗Significant at 0.01.

Page 20: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile Internet1690 Management Science 57(9), pp. 1671–1691, © 2011 INFORMS

Table C.4 3SLS Estimation Results on Content Size

Dependent variable Explanatory variable Coefficient

Log Upload Size (t) Log Download Size (t−1) −00016640000105∗∗∗

Log Upload Size by NN (t−1) 00002740000395

Selection (t) 00015940000235∗∗∗

Log Download Size (t) Log Upload Size (t−1) −00047740000335∗∗∗

Log Download Size by NN (t−1) 00003440000725

Selection (t) 00053440000415∗∗

Notes. We used 13-week sample. NN refers to network neighbors. Estimatesfor time-period fixed effects are not reported for brevity.

∗∗Significant at 0.05; ∗∗∗significant at 0.01.

ReferencesAlbuquerque, P., P. Pavlidis, U. Chatow, K. Chen, Z. Jamal, K. Koh.

2010. Evaluating promotional activities in an online two-sidedmarket for user-generated content. Working paper, Universityof Rochester, Rochester, NY.

Aral, S., L. Muchnik, A. Sundararajan. 2009. Distinguishinginfluence-based contagion from homophily-driven diffusionin dynamic networks. Proc. National Acad. Sci. USA 106(51)21544–21549.

Arellano, M. 1987. Computing robust standard errors for within-groups estimators. Oxford Bull. Econom. Statist. 49(4) 431–434.

Banerjee, S., R. R. Dholakia. 2008. Does location based advertisingwork? Internat. J. Mobile Marketing 3(2) 68–74.

Baum, C. F. 2007. CHECKREG3: Stata module to check identifica-tion status of simultaneous equations system. Statistical Soft-ware Components, Department of Economics, Boston College,Boston. http://econpapers.repec.org/software/bocbocode/s456877.htm.

Baye, M., J. Rupert, J. Gatti, P. Kattuman, J. Morgan. 2009. Clicks,discontinuities, and firm demand online. J. Econom. Manage-ment Strategy 18(4) 935–975.

Becker, G. S. 1965. A theory of the allocation of time. Econom. J.75(299) 493–517.

Biørn, E. 2004. Regression systems for unbalanced panel data:A stepwise maximum likelihood procedure. J. Econom. 122(2)281–291.

Dasgupta, K., R. Singh, B. Viswanathan, D. Chakraborty,S. Mukherjea, A. Nanavati, A. Joshi. 2008. Social ties and theirrelevance to churn in mobile telecom networks. EDBT’08 Proc.11th Internat. Conf. Extending Database Tech., ACM, New York,668–677.

Dekimpe, M. G., D. M. Hanssens. 1995. The persistence of market-ing effects on sales. Marketing Sci. 14(1) 1–21.

Economist, The. 2010. The mobile data challenge: A report from theEconomist Intelligence Unit. http://graphics.eiu.com/upload/eb/Innopath_MobileData_WEB.pdf.

Ghose, A., S. P. Han. 2009. A dynamic structural model of userlearning on the mobile Internet. Working paper, New YorkUniversity, New York. http://ssrn.com/abstract=1485049.

Ghose, A., A. Goldfarb, S. Han. 2011. How is the mobile Internetdifferent? Search costs and local activities. Working paper, NewYork University, New York. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1732759.

Greene, W. 2007. Fixed and random effects models for countdata. Working Paper NYU EC-07-16, Stern School of Business,New York University, New York.

Hartmann, W. R., P. Manchanda, H. Nair, M. Bothner, P. Dodds,D. Godes, K. Hosanagar, C. Tucker. 2008. Modeling social inter-actions: Identification, empirical methods and policy implica-tions. Marketing Lett. 19(3) 287–304.

Heckman, J. J. 1979. Sample selection bias a specification error.Econometrica 47(1) 153–161.

Hill, S., F. Provost, C. Volinsky. 2006. Network-based marketing:Identifying likely adopters via consumer networks. Statist. Sci.21(2) 256–276.

Homans, G. C. 1958. Social behavior as exchange. Amer. J. Sociol.63(6) 597–606.

Iyengar, R., C. Van den Bulte, T. W. Valente. 2011. Opinion leader-ship and social contagion in new product diffusion. MarketingSci. 30(2) 195–212.

Jacoby, J., G. J. Szybillo, C. K. Berning. 1976. Time and consumerbehavior: An interdisciplinary overview. J. Consumer Res. 2(1)320–339.

Lahiri, K., P. Schmidt. 1978. On the estimation of triangular struc-tural systems. Econometrica 46(5) 1217–1221.

Lancaster, T. 2000. The incidental parameters problem since 1948.J. Econometrics 95(2) 391–414.

Manski, C. 1993. Identification of endogenous social effects. Rev.Econom. Stud. 60(3) 531–542.

Mundlak, Y. 1978. On the pooling of time series and cross sectiondata. Econometrica 46(1) 69–85.

Nair, H. S., P. Manchanda, T. Bhatia 2010. Asymmetric social inter-actions in physician prescription behavior: The role of opinionleaders. J. Marketing Res. 47(5) 883–895.

Nam, S., P. Manchanda, P. K. Chintagunta. 2010. The effect of signalquality and contiguous word of mouth on customer acquisitionfor a video-on-demand service. Marketing Sci. 29(4) 690–700.

Nerlove, M. 1967. Experimental evidence on the estimation ofdynamic economic relations from a time series of cross sec-tions. Econom. Stud. Quart. 18(3) 42–74.

Nickell, S. 1981. Biases in dynamic models with fixed effects. Econo-metrica 49(6) 1417–1426.

Oestreicher-Singer, G., A. Sundararajan. 2010. The visible hand?Demand effects of recommendation networks in electronicmarkets. Working paper, Tel Aviv University, Tel Aviv, Israel.

O’Hara, K., A. Mitchell, A. Vorbau. 2007. Consuming video onmobile devices. Proc. SIGCHI Conf. Human Factors Comput. Sys-tems, ACM, New York, 857–866.

Osborne, M. 2007. Consumer learning, switching costs, andheterogeneity: A structural examination. EAG Discussionspaper 200710, Antitrust Division, Department of Justice,Washington, DC.

Puhani, P. 2000. The heckman correction for sample selection andits critique. J. Econom. Surveys 14(1) 53–68.

Rethink Wireless. 2010. AT&T will use new device formats to intro-duce usage-based pricing. Accessed November 1, http://www.rethink-wireless.com/article.asp?article_id=2722.

Shim, J. P., S. Park, J. M. Shim. 2008. Mobile TV phone: currentusage, issues, and strategic implications. Indust. Managementand Data Systems 108(9) 1269–1282.

Stewart, M. B. 2006. Maximum simulated likelihood estimation ofrandom—Effects dynamic probit models with autocorrelatederrors. Stata J. 6(2) 256–272.

Stewart, M. B. 2007. The interrelated dynamics of unemploymentand low-wage employment. J. Appl. Econometrics 22(3) 511–531.

Stoeckl, R., P. Rohrmeier, T. Hess. 2007. Motivations to produceuser generated content: Differences between webloggers and

Page 21: An Empirical Analysis of User Content Generation and ...people.stern.nyu.edu/aghose/mobilemedia_print.pdfThird, mobile Internet usage behavior of social network neighbors positively

Ghose and Han: User Content Generation and Usage Behavior on the Mobile InternetManagement Science 57(9), pp. 1671–1691, © 2011 INFORMS 1691

video bloggers. BLED 2007 Proc., Paper 30, http://aisel.aisnet.org/bled2007/30.

Susarla, A., J.-H. Oh, Y. Tan. 2011. Social networks and the diffu-sion of user-generated content: Evidence from YouTube. Inform.Systems Res., ePub ahead of print April 8, http://is.journal.informs.org/cgi/content/abstract/isre.1100.0339v1.

Trusov, M., A. V. Bodapati, R. E. Bucklin. 2010. Determining influ-ential users in Internet social networks. J. Marketing Res. 47(4)643–658.

Tucker, C. 2008. Identifying formal and informal influence in tech-nology adoption with network externalities. Management Sci.54(12) 2024–2038.

Vanhonacker, W. R., V. Mahajan, B. J. Bronnenberg. 2000. The emer-gence of market structure in new repeat-purchase categories:The interplay of market share and retailer distribution. J. Mar-keting Res. 37(1) 16–31.

Verbeek, M. 1990. On the estimation of a fixed effects model withselectivity bias. Econom. Lett. 34(3) 267–270.

Verbeek, M., T. Nijman. 1996. Incomplete panels and selec-tion bias. L. Mátyás, P. Sevestre, eds. The Econometricsof Panel Data: A Handbook of the Theory with Applications.Kluwer Academic Publishers, Dordrecht, The Netherlands,449–490.

Wooldridge, J. M. 2002. Econometric Analysis of Cross Section andPanel Data. MIT Press, Cambridge, MA.

Xia, M., Y. Huang, W. Duan, A. B. Whinston. 2011. To con-tinue sharing or not to continue sharing? An empiricalanalysis of user decision in peer-to-peer sharing networks.Inform. Systems Res., ePub ahead of print April 8, http://isr.journal.informs.org/cgi/content/abstract/isre.1100.0344v1.

Zabel, J. E. 1992. Estimating fixed and random effects models withselectivity. Econom. Lett. 40(3) 269–272.


Recommended