BLOG - Page 3 of 3 - Big Data Partnership

Our Thoughts

How to analyse your customers' social profile in 24 hours (Part II – analysis)

(This is the second part of the post. For the first part, see How to analyse your customers' social profile in 24 hours – Data and Collection.)

Community Level

After collecting the data as described in the previous post, we can look into the data and visualize some aspects of it. There are many questions we can ask of this data,…


How to analyse your customers' social profile in 24 hours (Part I – assumptions and data collection)

Social profiles tell us a lot about the interests of their owners, and also about the people/organisations they follow and the people who follow them. This blog post is a summary of what information you can get by collecting and analysing your customer profiles in 24 hours. In fact, after unlocking the data, this process…


Hadoop becomes Mainstream

Hadoop is a grassroots phenomenon that emerged in the social networking and consumer Internet world. As always, there are early adopters who take risks on the cutting edge, and there are more conservative organizations watching the pioneers from the sidelines. This played out in 2011 as early customer experiences with Hadoop were shared via conferences,…



“Introducing YARN” – Hadoop No More a Baby Elephant

With companies increasingly adopting and relying on Hadoop, and with Hadoop widely accepted as a solution for Big Data platforms, the Hadoop development team has turned its focus to the current architectural deficiencies, aiming to free Hadoop from these underlying issues. Along that path, a new version of Hadoop MapReduce has been born: MapReduce 2.0…


Map Side and Reduce Side Joins

Joins: Joins are one of the interesting features available in MapReduce. Joins performed by the Mapper are called Map-side joins. Joins performed by the Reducer are called Reduce-side joins. Frameworks like Pig, Hive, or Cascading have support for performing joins. Before diving into the implementation, let us understand the problem thoroughly. If we…
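As a rough illustration of the distinction drawn in this excerpt, here is a minimal sketch in plain Python rather than actual Hadoop code. The sample tables and the function names `map_side_join` and `reduce_side_join` are hypothetical, and the in-memory grouping only stands in for the shuffle step that a real MapReduce job would perform.

```python
from collections import defaultdict

# Hypothetical sample data: (user_id, name) and (user_id, item) pairs.
users = [(1, "alice"), (2, "bob")]
orders = [(1, "book"), (1, "pen"), (3, "lamp")]


def map_side_join(small, large):
    """Inner join done in the 'mapper': the small table is held in memory
    and each record of the large table is looked up against it."""
    lookup = dict(small)
    return [(key, lookup[key], value) for key, value in large if key in lookup]


def reduce_side_join(left, right):
    """Inner join done in the 'reducer': records are tagged with their
    source, grouped by key, and paired up once the group is complete."""
    groups = defaultdict(lambda: {"L": [], "R": []})
    # Map phase: emit (key, tagged value); the dict simulates the shuffle.
    for key, value in left:
        groups[key]["L"].append(value)
    for key, value in right:
        groups[key]["R"].append(value)
    # Reduce phase: combine left and right values that share a key.
    joined = []
    for key in sorted(groups):
        for left_value in groups[key]["L"]:
            for right_value in groups[key]["R"]:
                joined.append((key, left_value, right_value))
    return joined


map_result = map_side_join(users, orders)
reduce_result = reduce_side_join(users, orders)
```

Both functions produce the same inner join here. The map-side variant assumes one input is small enough to fit in memory, while the reduce-side variant makes no such assumption but relies on grouping records by key first.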


Bloom Filter Vs Feature Hashing

Bloom Filter: A Bloom filter is a space-efficient probabilistic data structure used to encode sets and perform set membership tests, that is, to check whether an element is a member of a set. False positives are possible, but false negatives are not; i.e. a query returns either “inside set (may be wrong)” or “definitely…
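The membership behaviour described in this excerpt can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from the full post: the class name `BloomFilter`, the method `might_contain`, the default sizes, and the choice of salted SHA-256 as the hash family are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch: a bit array plus k salted hashes."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive k bit positions by salting a single hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False: the item was never added. True: the item was possibly added.
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
for word in ["hadoop", "mahout", "yarn"]:
    bf.add(word)

# A deliberately tiny filter: every item maps to the same single bit,
# so any query after one insertion is a false positive.
tiny = BloomFilter(num_bits=1, num_hashes=1)
tiny.add("a")
```

Every added item is always reported as possibly present, which is why false negatives cannot occur. The one-bit filter shows the other side: `tiny.might_contain("b")` is true even though "b" was never added.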


Clustering with Mahout

Clustering Introduction: Clustering is one of the most popular techniques in the machine learning field. It allows a system to group numerous entities into separate clusters/groups based on certain characteristics/features of the entities. Clustering is widely used in grouping problems such as grouping similar news articles, blogs, emails, malware, etc. based on their…


Recommending from big data

As the research on core recommender systems progresses and matures, it becomes clear that a fundamental issue for these algorithms is to determine how to embed the core techniques in real operational systems and how to deal with massive and dynamic sets of data. Recommender system algorithms are very effective in identifying and predicting user preferences based on explicit or implicit indication of preference that…



Let the discussions begin……

At Big Data Partnership we love to tell you about what we are up to and what interesting things we’ve seen or heard. But we prefer to hear about the things you have heard and what you think about Big Data and even Big Data Partnership. Over the coming weeks we will be coming…

