The good, the bad, and the ugly of social media

  1. It is used essentially for self-expression.
  2. It is extremely undemocratic.
  3. It is biased.
  4. It creates many accidental activists.
  5. It makes close friends distant and distant friends closer.
  6. It reduces everything to one person, one moment (immediacy).
  7. It is corrupted by celebrities.
  8. Users are running somebody else’s business.
  9. Users post first and think later.

Sauble Beach

  1. Man’s mind, values, and goals might be a trap. (Monkey’s fist)
  2. No voice will be lost in the Universe. (Boomerang)
  3. A slow learner will be re-trained and re-trained until (s)he learns. (Samsara)
  4. The Universe is built upon fear. (Rumi)
  5. The Universe responds positively to gratefulness.
  6. Life is not a multi-objective process. Man cannot satisfy all objectives.
  7. Man has to leave one state to enter a new one.
  8. A sparse model for life is more effective. (A.G.B law)
  9. Time is the only measurable variable with unlimited value.
  10. Man may live once, twice, or many times, but can keep one and only one life experience in his mind.
  11. Boredom (in Buddha’s version: desire) is the root cause of all evil.
  12. Procrastination is the second root cause of all evil.
  13. Self-control may not be a virtue, but it is absolutely necessary.
  14. Man cannot achieve without solitude.

(to be continued)

For decades, physicists have been looking for a universal theory that unifies fundamental theories such as quantum mechanics and gravity. Do engineers, especially in the domain of computing and information systems, have a universal model that explains all tasks, techniques, models, and problems?

I believe the answer is filter theory. I personally have not been able to find even a single counterexample. From data transmission to data processing, data storage, encryption, recommendation, and prediction, everything can be explained by filter theory.

If the innate characteristic of all our models is filtering, then our main focus is on channels. A channel could be a transmission line, a memory, a learning machine, or a compression algorithm. We spend a great deal of effort, and of course academic publishing, on designing trustworthy channels, or filters. Trustworthy channels do not lie, do not betray the source, and do not mislead the destination. In other words, our focus has been on designing honest channels. But all of this rests on one strong assumption: that the source is good, trustworthy, and honest.
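As a toy illustration of this view (my own sketch, not from the original post), even a simple moving average is a channel in this sense: it passes the slow trend of a signal through while attenuating spikes.

```python
def moving_average(signal, window=3):
    """A minimal low-pass filter: each output value is the mean of a
    trailing window of inputs, so isolated spikes are spread and damped."""
    out = []
    for i in range(len(signal)):
        w = signal[max(0, i - window + 1): i + 1]
        out.append(sum(w) / len(w))
    return out

# A single spike of 9 is attenuated to three samples of 3
print(moving_average([0, 0, 9, 0, 0], window=3))  # [0.0, 0.0, 3.0, 3.0, 3.0]
```

Whether the "channel" is this three-line smoother or a deep learning model, the design question is the same: what does it let through, and what does it suppress?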

In almost all models, the honesty of the source is out of scope. For natural and man-made sources such as sensors, engineers model and estimate sampling and measurement error. But dishonesty is obviously different from error. In a social context, an error is an honest mistake made in good faith: it does not come from bad intention and is mostly caused by poor judgement.

If honesty means “a sincere intention to deal fairly with others,” then dishonesty has two detectable features: insincere intention and lack of fairness. Unfortunately, in many social and human-behavior studies based on data collected from questionnaires and surveys, these two variables are hard to measure. As a matter of fact, we humans demonstrate our true intentions and opinions when we behave or express our views unintentionally, that is, when we do not manipulate our thoughts to advance our interests or to fend off threats. But most lab experiments, surveys, and questionnaires rest on the fact that participants know they are under study.

When we know the consequences of our actions, our behavior changes dramatically. This is why we rely on body language more than on spoken language: body language is an honest language. And this is why politicians always stand behind a podium, hiding their true intentions and expressing “what they have to say” rather than “what they really want to say.”

Social media provide us with a unique opportunity to collect “honest data”.
The honesty of social data can be measured by the following criteria:

1. In most cases, users publish their opinions, express their intentions, and behave without calculating the consequences. Note that social data is based not on a single incident but on the frequency of a behavior, on similar patterns repeated again and again. A good example is the “Like” feature on Facebook: a user may “like” tens of items per day and thousands per year.
2. Social data is, and has to be, big. One reason was addressed in the first criterion; the second is that increasing the size of the data reduces the influence of noise and outliers.
3. Social data is, and has to be, based on heavy users. Infrequent users are usually either very conservative (calculating consequences and hiding their true intentions) or simply not very serious.
4. Social data is interactive, meaning that more hidden behaviors are revealed through interaction.
5. Social data has to be collected from independent sources. Not all social data are honest: if the opinions do not come from independent sources, the credibility and honesty of the data are questionable.
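The second criterion is essentially the law of large numbers: averaging more independent observations shrinks the influence of noise and outliers. A small sketch (the "true opinion" value and noise level are illustrative assumptions, not real social data):

```python
import random
import statistics

def estimated_opinion(n_posts, true_opinion=0.7, seed=1):
    """Estimate a user's underlying opinion from n noisy observations.

    Each observation is the true opinion plus independent Gaussian noise;
    the error of the sample mean shrinks roughly as 1/sqrt(n_posts).
    """
    rng = random.Random(seed)
    posts = [true_opinion + rng.gauss(0.0, 1.0) for _ in range(n_posts)]
    return statistics.fmean(posts)

# Larger samples pull the estimate toward the true value of 0.7
for n in (10, 1_000, 100_000):
    print(n, round(estimated_opinion(n), 4))
```

With ten observations the estimate can be badly off; with a hundred thousand, individual outliers barely move it, which is one reason honest social data "has to be big".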

 

[previously posted on Social Computer Research Networking Group in LinkedIn]

If you have not read Steven Strogatz’s fascinating book “Sync: How Order Emerges from Chaos in the Universe, Nature, and Daily Life”, I strongly recommend it. The author has a short TED talk briefly outlining the idea.

There is also a good documentary on the core idea of the book on CBC Ideas with Paul Kennedy, also highly recommended.

My question is whether we can apply this idea to predicting social events and trends that might be synchronized. This one is a good example of a similar idea.
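Strogatz’s central example is the Kuramoto model of coupled oscillators, in which each oscillator is nudged toward the average phase of the population. As a hedged sketch (the parameter values and function name are my own choices, not from the book), the following simulation reproduces the standard result: below a critical coupling strength the population stays incoherent, and above it the oscillators lock into sync, as measured by the order parameter r.

```python
import math
import random

def kuramoto(n=50, coupling=2.0, dt=0.05, steps=2000, seed=0):
    """Simulate n Kuramoto oscillators (mean-field form) and return the
    final order parameter r: near 0 = incoherent, near 1 = synchronized."""
    rng = random.Random(seed)
    # Natural frequencies and initial phases drawn at random
    omega = [rng.gauss(0.0, 0.5) for _ in range(n)]
    theta = [rng.uniform(0.0, 2 * math.pi) for _ in range(n)]
    for _ in range(steps):
        # Population mean phase psi and coherence r
        sx = sum(math.cos(t) for t in theta) / n
        sy = sum(math.sin(t) for t in theta) / n
        r = math.hypot(sx, sy)
        psi = math.atan2(sy, sx)
        # Each oscillator is pulled toward psi with strength K * r
        theta = [(t + dt * (w + coupling * r * math.sin(psi - t))) % (2 * math.pi)
                 for t, w in zip(theta, omega)]
    sx = sum(math.cos(t) for t in theta) / n
    sy = sum(math.sin(t) for t in theta) / n
    return math.hypot(sx, sy)

print("weak coupling:  ", kuramoto(coupling=0.1))  # stays incoherent
print("strong coupling:", kuramoto(coupling=3.0))  # locks into sync
```

The same order-parameter idea could, in principle, be computed over time-stamped social signals to ask how synchronized a trend actually is.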

A group of researchers at the University of Calgary has conducted an interesting study of countdown timers at intersections, to see whether they improve road safety and reduce collisions and car crashes. The result was very surprising: the device causes more crashes. The link is attached to this post. The researchers’ explanation is convincing, but we can also explain this surprising result from a completely different angle, that of social structures and drivers’ social behavior.

When we drive our cars on roads and highways, we unintentionally form a social structure with the other drivers in our vicinity. Our social interaction with them consists of not getting too close, in order to prevent a crash. Our collective behavior is to keep enough distance for a safe drive. All of this happens under a social and collective judgement that works properly most of the time. Adding a technology such as a countdown timer can disrupt that collective judgement, because every driver has his or her own interpretation of the number displayed on the timer.

The lesson is that we have to be very cautious when implementing social recommendation tasks such as link and friend recommendation: the risk is damaging the organic social fabric. This is the criticism we may level at the “People You May Know” feature in LinkedIn and at similar features in Facebook and Twitter. It seems the only objective is to grow the network as fast as possible, even at the cost of the network’s health.

Good examples of organic social fabric serving as channels that relay social and collective behavior are bird flocks and fish schools. There are no reported collisions or crashes in bird flocks or fish schools, although sometimes tens of thousands of these animals move together in sync and harmony. This might be because there is no nagging kid in the back seat, no catchy billboard ads, and no sidewalk distractions; but the main reason is that nothing disturbs the collective behavior of birds and fish, and this allows them to behave based on their instinct.

Hi,

I am starting this blog as a unified environment for my personal web site (including my academic web site) and my daily blog posts. I believe WordPress is a good platform for this purpose.

At the top of this site (the menu) you can find my permanent (timeless) pages, including my contact info.

 

1. Node Classification in Social Networks

by: Smriti Bhagat, Graham Cormode, S. Muthukrishnan

2. Lexicon-based methods for sentiment analysis

by: Maite Taboada, Julian Brooke, Milan Tofiloski, Kimberly Voll, Manfred Stede

3. Sentiment in Twitter events

by: Mike Thelwall, Kevan Buckley, Georgios Paltoglou

4. Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena

by: Johan Bollen, Alberto Pepe, Huina Mao

5. Learning to Classify Threatening E-mail

by: Subramanian A. Balamurugan, Ramasamy Rajaram

6. Twitter mood predicts the stock market

by: Johan Bollen, Huina Mao, Xiao-Jun Zeng

7. Predicting Positive and Negative Links in Online Social Networks

by: Jure Leskovec, Daniel Huttenlocher, Jon Kleinberg

8. You Are Who You Talk To: Detecting Roles in Usenet Newsgroups

by: D. Fisher, M. Smith, H. T. Welser

9. Supervised Machine Learning Applied to Link Prediction in Bipartite Social Networks

by: Nesserine Benchettara, Rushed Kanawati, Celine Rouveirol

10. Link Propagation: A Fast Semi-supervised Learning Algorithm for Link Prediction

by: Hisashi Kashima, Tsuyoshi Kato, Yoshihiro Yamanishi, Masashi Sugiyama, Koji Tsuda

11. Suggesting friends using the implicit social graph

by: Maayan Roth, Assaf B. David, David Deutscher, Guy Flysher, Ilan Horn, Ari Leichtberg, Naty Leiser, Yossi Matias, Ron Merom

12. Cold start link prediction

by: Vincent Leroy, Barla B. Cambazoglu, Francesco Bonchi

13. Normalized Information Distance

by: Paul M. B. Vitanyi, Frank J. Balbach, Rudi L. Cilibrasi, Ming Li

14. Clustering by Compression

by: R. Cilibrasi, P. M. B. Vitanyi

15. An Iterative Hybrid Filter-Wrapper Approach to Feature Selection for Document Clustering

by: Mohammad-Amin Jashki, Majid Makki, Ebrahim Bagheri, Ali Ghorbani

16. Estimating the number of clusters in a dataset via the Gap statistic

by: Robert Tibshirani, Guenther Walther, Trevor Hastie

17. Measures for Short Segments of Text

by: Donald Metzler, Susan Dumais, Christopher Meek

18. Gender, Genre, and Writing Style in Formal Written Texts

by: Shlomo Argamon, Moshe Koppel, Jonathan Fine, Anat R. Shimoni

19. A Social Network Analysis Approach to Detecting Suspicious Online Financial Activities

by: Lei Tang, Geoffrey Barbier, Huan Liu, Jianping Zhang

20. Using Social Network Analysis for Spam Detection

by: Dave DeBarr, Harry Wechsler

21. Information Distance

by: Charles H. Bennett, Peter Gacs, Ming Li, Paul M. B. Vitanyi, Wojciech H. Zurek

22. Information Distance and Its Applications

by: Ming Li

23. Combining Labeled and Unlabeled Data with Co-Training

by: Avrim Blum, Tom Mitchell

24. Learning from Imbalanced Data Sets: A Comparison of Various Strategies

by: Nathalie Japkowicz

25. One-class svms for document classification

by: Larry M. Manevitz, Malik Yousef