- March 3, 2014
- Vasilis Vryniotis
In the previous article we discussed the Data Envelopment Analysis technique and saw how it can be used as an effective non-parametric ranking algorithm. In this blog post we will develop an implementation of Data Envelopment Analysis in Java and use it to evaluate the social media popularity of web pages and articles on the web. The code is open source (under the GPL v3 license) and you can download it freely from GitHub.
Update: The Datumbox Machine Learning Framework is now open source and free to download. Check out the package com.datumbox.framework.algorithms.dea to see the implementation of Data Envelopment Analysis in Java.
Data Envelopment Analysis implementation in Java
The code is written in Java and can be downloaded directly from GitHub. It is licensed under GPLv3, so feel free to use it, modify it and redistribute it.
The code implements the Data Envelopment Analysis algorithm, uses the lp_solve library to solve the linear programming problems, and uses data extracted from the Web SEO Analytics index to construct a composite social media popularity metric for web pages based on their shares on Facebook, Google Plus and Twitter. All the theoretical parts of the algorithm are covered in the previous article, and the source code contains detailed Javadoc comments about the implementation.
Below we provide a high-level description of the architecture of the implementation:
1. lp_solve 5.5 library
To solve the various linear programming problems, we use an open-source library called lp_solve. The library is written in ANSI C, and its methods are invoked from Java through a wrapper. Thus, before running the code you must install lp_solve on your system. Binaries of the library are available for both Linux and Windows, and you can read more about the installation in the lp_solve documentation.
Please make sure that the library is installed on your system before trying to run the Java code. For any problem concerning the installation and configuration of the library, please refer to the lp_solve documentation.
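A quick way to confirm that the native library is visible to the JVM is to try loading it before running any DEA code. The sketch below is only an illustration: the class NativeLibraryCheck is hypothetical, and the library name "lpsolve55j" is an assumption about how the lp_solve 5.5 Java wrapper names its native stub, so check the lp_solve documentation for the name used by your installation.

```java
// Minimal sketch (hypothetical helper, not part of the published code).
final class NativeLibraryCheck {

    // Returns true if the JVM can load the named native library
    // from java.library.path, false otherwise.
    static boolean isAvailable(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // ASSUMPTION: "lpsolve55j" is the name of the lp_solve 5.5 JNI stub library.
        String name = args.length > 0 ? args[0] : "lpsolve55j";
        System.out.println(name + " available: " + isAvailable(name));
    }
}
```

If the check fails, the usual cause is that the directory holding the native library is missing from java.library.path (or from the system library path).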
2. DataEnvelopmentAnalysis Class
This is the main class of the implementation of the DEA algorithm. It implements a public method called estimateEfficiency() which takes a Map of records and returns their DEA scores.
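For orientation, the linear program solved for each record is commonly written in the multiplier (CCR) form below. This is a textbook formulation and not a transcription of the repository code, which may use a different variant. The notation is introduced here for illustration: $x_{ij}$ is input $i$ of record $j$, $y_{rj}$ is output $r$ of record $j$, $v_i$ and $u_r$ are the weights to be found, and the index $0$ marks the record being scored.

```latex
\begin{aligned}
\max_{u,\,v}\quad & \sum_{r} u_r \, y_{r0} \\
\text{subject to}\quad & \sum_{i} v_i \, x_{i0} = 1, \\
& \sum_{r} u_r \, y_{rj} - \sum_{i} v_i \, x_{ij} \le 0, \qquad j = 1, \dots, n, \\
& u_r \ge 0, \quad v_i \ge 0 .
\end{aligned}
```

One such program is solved per record, with all $n$ records contributing constraints, so the optimal objective value is the efficiency score of record $0$.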
3. DeaRecord Object
The DeaRecord is a special object that stores the data of our record. Since DEA requires separating the input from the output, the DeaRecord object stores our data separately in a way that DEA can handle.
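To make the idea concrete, here is a minimal sketch of such a container. The class SimpleDeaRecord and its accessors are hypothetical stand-ins; the real DeaRecord in the repository may differ. The only detail taken from the post is the constructor argument order (output first, input second), which follows the Depot1 usage example shown later.

```java
import java.util.Arrays;

// Hypothetical stand-in for DeaRecord; names and accessors are assumptions.
final class SimpleDeaRecord {
    private final double[] output;
    private final double[] input;

    // Output-only record: the input vector is left empty.
    SimpleDeaRecord(double[] output) {
        this(output, new double[0]);
    }

    // Output values first, input values second, mirroring the Depot1 example.
    SimpleDeaRecord(double[] output, double[] input) {
        // Defensive copies keep the record independent of the caller's arrays.
        this.output = Arrays.copyOf(output, output.length);
        this.input = Arrays.copyOf(input, input.length);
    }

    double[] getOutput() {
        return Arrays.copyOf(output, output.length);
    }

    double[] getInput() {
        return Arrays.copyOf(input, input.length);
    }
}
```

Keeping the two vectors in separate fields means the solver never has to guess which values are inputs and which are outputs.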
4. SocialMediaPopularity Class
SocialMediaPopularity is an application that uses DEA to evaluate the popularity of a page on social media networks based on its Facebook likes, Google +1s and Tweets. It implements two protected methods, calculatePopularity() and estimatePercentiles(), along with two public methods, loadFile() and getPopularity().
The calculatePopularity() method uses the DEA implementation to estimate the scores of the pages based on their social media counts. The estimatePercentiles() method takes the DEA scores and converts them into percentiles. Percentiles are generally easier to explain than DEA scores: when we say that the popularity score of a page is 70%, it means that the page is more popular than 70% of the pages.
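The conversion from scores to percentiles can be sketched as follows. This is an illustration under one stated assumption, namely that a page's percentile is the share of pages with a strictly lower score; the actual estimatePercentiles() method may treat ties or boundaries differently. The class PercentileSketch and its method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the repository's estimatePercentiles() method.
final class PercentileSketch {

    // Maps each id to the fraction of records whose score is strictly lower.
    static Map<String, Double> toPercentiles(Map<String, Double> scores) {
        double[] sorted = scores.values().stream()
                .mapToDouble(Double::doubleValue)
                .sorted()
                .toArray();
        Map<String, Double> percentiles = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            int below = countBelow(sorted, e.getValue());
            percentiles.put(e.getKey(), (double) below / sorted.length);
        }
        return percentiles;
    }

    // Binary search for the first index whose value is >= score;
    // that index equals the number of strictly smaller values.
    private static int countBelow(double[] sorted, double score) {
        int lo = 0;
        int hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < score) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

With this definition, tied pages receive the same percentile and the top page in a set of n pages gets at most (n-1)/n.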
To estimate the popularity of a particular page, we must have a dataset with the social media counts of other pages. This makes sense: to predict which page is popular and which is not, you must be able to compare it with other pages on the web. To do so, we use a small anonymized sample from the Web SEO Analytics index, provided in txt format. You can build your own database by extracting the social media counts from more pages on the web.
The loadFile() method loads the aforementioned statistics into DEA, and getPopularity() is an easy-to-use method that takes the Facebook likes, Google +1s and number of Tweets of a page and evaluates its popularity on social media.
Using the Data Envelopment Analysis Java implementation
In the DataEnvelopmentAnalysisExample class I provide 2 different examples of how to use the code.
The first example uses the DEA method directly to evaluate the efficiency of organizational units based on their output (ISSUES, RECEIPTS, REQS) and input (STOCK, WAGES). This example was taken from an article on DEAzone.com.
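// Requires java.util.LinkedHashMap, java.util.Map and java.util.TreeMap
// Each record holds the output values first and the input values second
Map<String, DeaRecord> records = new LinkedHashMap<>();
records.put("Depot1", new DeaRecord(new double[]{40.0,55.0,30.0}, new double[]{3.0,5.0}));
//...adding more records here...
DataEnvelopmentAnalysis dea = new DataEnvelopmentAnalysis();
Map<String, Double> results = dea.estimateEfficiency(records);
// TreeMap prints the scores sorted by record id
System.out.println((new TreeMap<>(results)).toString());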
The second example uses our Social Media Popularity application to evaluate the popularity of a page using data from social media, such as Facebook likes, Google +1s and Tweets. All social media counts are marked as output and we pass an empty input vector to DEA.
SocialMediaPopularity rank = new SocialMediaPopularity();
// Load the reference dataset of social media counts from the classpath
rank.loadFile(DataEnvelopmentAnalysisExample.class.getResource("/datasets/socialcounts.txt"));
Double popularity = rank.getPopularity(135, 337, 9079); //Facebook likes, Google +1s, Tweets
System.out.println("Page Social Media Popularity: "+popularity.toString());
Necessary Expansions
The provided code is just an example of how DEA can be used as a ranking algorithm. Here are a few expansions that must be made in order to improve the implementation:
1. Speeding up the implementation
This DEA implementation evaluates the DEA scores of all the records in the database. This makes the implementation slow, since it requires solving as many linear programming problems as there are records in the database. If we do not need the scores of all the records, we can speed up the execution significantly. Thus a small expansion of the algorithm can give us better control over which records should be solved and which should be used only as constraints.
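One possible shape for this expansion is sketched below. It is a design illustration only: SelectiveScoring, scoreOnly() and the SingleRecordSolver hook are hypothetical names that do not exist in the published code, and the linear program itself is left behind the hook. The point is that every record still supplies constraints, while a linear program is solved only for the requested ids.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of selective scoring; not part of the published code.
final class SelectiveScoring {

    // Hypothetical hook that solves one linear program: the target record is
    // scored against every record in 'all', which supplies the constraints.
    interface SingleRecordSolver<R> {
        double solve(String targetId, Map<String, R> all);
    }

    // Solves one problem per requested id instead of one per record.
    static <R> Map<String, Double> scoreOnly(Collection<String> targetIds,
                                             Map<String, R> all,
                                             SingleRecordSolver<R> solver) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (String id : targetIds) {
            if (!all.containsKey(id)) {
                throw new IllegalArgumentException("Unknown record id: " + id);
            }
            scores.put(id, solver.solve(id, all));
        }
        return scores;
    }
}
```

Scoring a single new page against a database of n records then costs one linear program rather than n.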
2. Expanding the Social Media Counts Database
The provided Social Media Counts Database consists of 1111 samples from the Web SEO Analytics index. To estimate a more accurate popularity score, a larger sample is necessary. You can create your own database by estimating the social media counts from more pages of the web.
3. Adding more Social Media Networks
The implementation uses the Facebook likes, the Google +1s and the number of Tweets to evaluate the popularity of an article. Nevertheless, metrics from other social media networks can easily be taken into account. All you need to do is build a database with the social media counts from the networks that you are interested in and expand the SocialMediaPopularity class to handle them accordingly.
Final comments on the implementation
To be able to expand the implementation you must have a good understanding of how Data Envelopment Analysis works. This is covered in the previous article, so please make sure you read the tutorial before you proceed to any changes. Moreover, in order to use the Java code you must have the lp_solve library installed on your system (see above).
If you use the implementation in an interesting project, drop us a line and we will feature your project on our blog. Also, if you like the article, please take a moment and share it on Twitter or Facebook.
