Thursday, 28 March 2013

IT & BA LAB Session 10: 26/03/2013

Assignment 1: Create 3 vectors x, y and z, choose any random values for them, ensuring they are of equal length, and combine them:
T<- cbind(x,y,z)
Create a 3-dimensional plot of the same (all 3 types as taught).

Commands:
> library(rgl)   # plot3d() comes from the rgl package
> Random1<-rnorm(30,mean=0,sd=1)
> Random1
> x<-Random1[1:10]
> x
> y<-Random1[11:20]
> y
> z<-Random1[21:30]
> z
> T<-cbind(x,y,z)
> T
> plot3d(T[,1:3])
> plot3d(T[,1:3],col=rainbow(64))
> plot3d(T[,1:3],col=rainbow(64),type='s')
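A small sketch of the data-preparation part of this assignment, using base R only. The seed value and the matrix name m are my own choices for illustration (in R, T is also shorthand for TRUE, so a different name avoids masking it); the plotting calls are left out so the sketch runs without rgl.

```r
# Fix the seed so the "random" values are the same on every run (assumed value).
set.seed(42)

# Draw 30 standard-normal values and split them into three vectors of 10.
r <- rnorm(30, mean = 0, sd = 1)
x <- r[1:10]
y <- r[11:20]
z <- r[21:30]

# cbind() needs equal-length vectors to give one clean column per variable.
stopifnot(length(x) == length(y), length(y) == length(z))

# m plays the role of T in the commands above.
m <- cbind(x, y, z)
print(dim(m))
print(colnames(m))
```

The resulting matrix has one row per point and one column per axis, which is the shape plot3d expects when it is given T[,1:3].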


 Screenshots:
Assignment no 2:
Read the documentation of rnorm and pnorm.
Create 2 random variables.
Create the following plots:
1. X-Y
2. X-Y|Z (introduce a variable z with 5 different categories and cbind it to x and y) Hint: ?factor
3. Colour-coded graph
4. Smooth and best-fit line for the curve
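As a quick orientation on the two functions named in the assignment: rnorm draws random values from a normal distribution, while pnorm gives the probability that such a value falls at or below a cut-off. The sketch below compares the two; the seed, the sample size of 10000 and the cut-off of 6 are assumptions chosen only for illustration.

```r
# Assumed seed and sample size, for illustration only.
set.seed(123)
draws <- rnorm(10000, mean = 5, sd = 1)

# Share of simulated values at or below the cut-off.
empirical <- mean(draws <= 6)

# Probability of the same event from the normal distribution function.
theoretical <- pnorm(6, mean = 5, sd = 1)

print(c(empirical = empirical, theoretical = theoretical))
```

With a large sample the two numbers should be close, which is a simple way to check one's reading of the documentation.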

Commands:
> library(ggplot2)   # qplot() comes from the ggplot2 package
> x<-rnorm(200,mean=5,sd=1)
> y<-rnorm(200,mean=3,sd=1)
> z1<-sample(letters,5)
> z2<-sample(z1,200,replace=TRUE)
> z<-as.factor(z2)
> t<-cbind(x,y,z)   # cbind stores z as its integer level codes
> qplot(x,y)
> qplot(x,z,alpha=I(2/10))
> qplot(x,z)
> qplot(x,y,geom=c("point","smooth"))
> qplot(x,y,colour=z)
> qplot(log(x),log(y),colour=z)
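One thing worth checking about the cbind step: because a matrix can hold only one type, the factor z is stored as its integer level codes and the letters are lost. A minimal sketch of the difference, assuming a data frame as the alternative container (the names m and d, the seed and the fixed letters a to e are my own, not from the lab):

```r
# Assumed seed and category labels, for illustration only.
set.seed(7)
x <- rnorm(200, mean = 5, sd = 1)
y <- rnorm(200, mean = 3, sd = 1)
z <- factor(sample(c("a", "b", "c", "d", "e"), 200, replace = TRUE))

# cbind() builds a numeric matrix: z becomes the codes 1..5.
m <- cbind(x, y, z)
print(is.numeric(m))
print(head(m, 3))

# data.frame() keeps z as a factor with its five labels.
d <- data.frame(x, y, z)
print(is.factor(d$z))
print(levels(d$z))
```

The plots above still work because qplot is given the vectors x, y and z directly, not the combined object.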
Screenshots

Saturday, 23 March 2013

IT & BA Lab session 9:19/3/2013

DATA VISUALIZATION & INFOGRAPHICS.

Tools used: zeebly.com and Visual.ly
Our society is highly socialized: people have become addicted to their Facebook pages and LinkedIn profiles, and in this competitive world where time is money it is important for us to manage our time well. People who, like me, spend a long time on social networking sites get bored of seeing the same old plain pages, and it takes considerable time to explore a profile and read everything on it. This is an old-fashioned technique. But now it is easier for me, because I have come across a site called

http://visual.ly/ which says "tell stories with data". Now, any kind of data has, in my opinion, always been an unwelcome guest! But this dogma of mine was challenged and compelled to change when I went through this website. One can present one's resume in colourful, comical and stylish pictographic designs. I may never use such a resume in a company interview, but even having it on my blog or FB page would be a shining medal, with promising likes on my posts.

Another similar site I came across is zeebly.com. Analyzing my FB page has never been such a revelation: http://www.zeebly.com/social_me/369805/all3/479acc48363

zeebly.com
My Facebook profile summary is shown here: the photo is my profile picture, and the summary talks about my likes and my interests.






This shows my total number of friends with the percentages of female and male friends. It is very helpful because it also sorts the data by age group.



Usage:
This tool is interesting and attractive: all the data is presented as pie charts and bar graphs. Because everything is graphical and precise, it is very easy to look at and pick out the required information.

Visual.ly


Visual.ly is a community platform for data visualization and infographics. It was founded by Stew Langille, Lee Sherman, Tal Siach, and Adam Breckler in 2011.

Visual.ly is structured both as a showcase for infographics and as a marketplace and community for publishers, designers, and researchers. The site allows users to search images through description, tags, and sources in a variety of categories, ranging from Education to Business or Politics. Users can publish infographics to their personal profile, which they can subsequently share through their social networks.

Visual.ly maintains a team of data analysts, journalists, and designers who create infographics and data visualizations using the Visual.ly tools. They are currently developing a tool that allows anyone to create and publish their own data visualizations. Through this tool, users will be able to gather information from databases and APIs in an automated service to produce an infographic.

By tapping into Visually's vibrant community of more than 35,000 designers, Marketplace is able to match infographic commissioners – brands, companies, agencies – with designers. Once matched, commissioners have direct access to the designers working on their projects and can communicate and transact with them in Visually's Project Center. Through such unique features as the Project Timeline, commissioners always know where their project stands and can ensure that it stays on time and on budget.

Visually partners with the world's leading publications and brands, bringing its tools, community, and talented team to bear on data visualization needs wherever bespoke creation is needed.


Some points that I found wonderful about this tool were:

The UI is very user friendly.
It is open source.
Numerous options for the visual presentation of different types of data are available.
The full tool is available online, so it is not necessary to install any software on your PC.
It is fast.
The results are attractive and elegant.
Themes and options suiting everyone's style and taste are available.
Once the visual presentation of the data is ready, options to save and share it are available.
Here is the picture of my resume, hope you will like it.



Friday, 15 March 2013

Session #8 -12 Mar Assignment

Session #8 -12 Mar Assignment Submission



Problem: 

Perform Panel Data Analysis of "Produc" data

Solution:


There are three types of models:
      Pooled effects model
      Fixed effects model
      Random effects model

We will determine which model is the best by using the functions:
       pFtest : for choosing between fixed and pooled
       plmtest : for choosing between pooled and random
       phtest : for choosing between random and fixed

The data can be loaded using the following commands
library(plm)   # provides plm(), pFtest(), plmtest() and phtest()
data(Produc , package ="plm")
head(Produc)

Pooled Effects Model:

pool <-plm( log(pcap) ~log(hwy)+ log(water)+ log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc,model="pooling",index =c("state","year"))
summary(pool)

Fixed Effects Model:

fixed<-plm( log(pcap) ~log(hwy)+ log(water)+ log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc,model="within",index =c("state","year"))
summary(fixed)

Random Effects Model:

random <-plm( log(pcap) ~log(hwy)+ log(water)+ log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc,model="random",index =c("state","year"))
summary(random)
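To get a feel for what the "within" option does compared with "pooling", here is a minimal sketch in base R on simulated data (not the Produc data). All names, the seed, the panel size and the true slope of 1.5 are assumptions made up for illustration. It shows that the within estimator amounts to subtracting each individual's own mean before running OLS, which gives the same slope as including one dummy per individual.

```r
# Simulated panel: 5 individuals, 20 periods each (assumed sizes).
set.seed(1)
n_id <- 5
n_t <- 20
id <- factor(rep(seq_len(n_id), each = n_t))

# Individual effect, constant over time and correlated with the regressor.
effect <- rep(c(-2, -1, 0, 1, 2), each = n_t)
x <- effect + rnorm(n_id * n_t)
y <- 1.5 * x + 3 * effect + rnorm(n_id * n_t)

# Pooled OLS ignores the individual effect, so its slope is biased here.
pooled_slope <- unname(coef(lm(y ~ x))["x"])

# Fixed effects via dummies: one intercept per individual.
dummy_slope <- unname(coef(lm(y ~ x + id))["x"])

# Fixed effects via the within transformation: demean by individual.
x_dm <- x - ave(x, id)
y_dm <- y - ave(y, id)
within_slope <- unname(coef(lm(y_dm ~ x_dm - 1))["x_dm"])

print(c(pooled = pooled_slope, dummies = dummy_slope, within = within_slope))
```

In this simulation the dummy and within slopes coincide and sit near the true value, while the pooled slope is pulled away from it because the individual effect is correlated with x.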


Testing of Model

This can be done through Hypothesis testing between the models as follows:

H0: Null Hypothesis: the individual index and time based params are all zero
H1: Alternate Hypothesis: at least one of the index and time based params is non zero

Pooled vs Fixed

Null Hypothesis: Pooled Effects Model
Alternate Hypothesis: Fixed Effects Model

Command:

> pFtest(fixed,pool)


Result:
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp) 
F = 56.6361, df1 = 47, df2 = 761, p-value < 2.2e-16
alternative hypothesis: significant effects 
Since the p-value is negligible, we reject the Null Hypothesis in favour of the Alternate Hypothesis, i.e. the Fixed Effects Model is preferred over the Pooled Effects Model.

Pooled vs Random

Null Hypothesis: Pooled Effects Model
Alternate Hypothesis: Random Effects Model

Command :
> plmtest(pool)

Result:

  Lagrange Multiplier Test - (Honda)
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp)
normal = 57.1686, p-value < 2.2e-16
alternative hypothesis: significant effects 

Since the p-value is negligible, we reject the Null Hypothesis in favour of the Alternate Hypothesis, i.e. the Random Effects Model is preferred over the Pooled Effects Model.

Random vs Fixed

Null Hypothesis: No correlation between the individual effects and the regressors, so the Random Effects Model is appropriate
Alternate Hypothesis: Fixed Effects Model

Command:
 > phtest(fixed,random)

Result:

 Hausman Test
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp)
chisq = 93.546, df = 7, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent 

Since the p-value is negligible, we reject the Null Hypothesis in favour of the Alternate Hypothesis, i.e. the Fixed Effects Model is preferred over the Random Effects Model.

Conclusion: 

So after running all the tests we come to the conclusion that the Fixed Effects Model is best suited for the panel data analysis of the "Produc" data set.

Hence, we conclude that each id, i.e. each "state", has its own individual effect, and within the same state that effect does not vary over time.
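The way the three test results were combined above can be written down as a small decision rule. This is only a sketch of the reasoning used in this post: the function name choose_panel_model, the 0.05 significance level and the use of 2.2e-16 as a stand-in for "p-value < 2.2e-16" are my own assumptions, and it is not part of the plm package.

```r
# Hypothetical helper: pick a model label from the three p-values.
#   p_ftest   : pFtest(fixed, pool)    pooled vs fixed
#   p_lmtest  : plmtest(pool)          pooled vs random
#   p_hausman : phtest(fixed, random)  random vs fixed
choose_panel_model <- function(p_ftest, p_lmtest, p_hausman, alpha = 0.05) {
  # No evidence of individual effects in either test: stay with pooling.
  if (p_ftest >= alpha && p_lmtest >= alpha) {
    return("pooling")
  }
  # Individual effects exist; Hausman decides between fixed and random.
  if (p_hausman < alpha) {
    return("within")
  }
  "random"
}

# All three p-values reported above are below 2.2e-16.
print(choose_panel_model(2.2e-16, 2.2e-16, 2.2e-16))
```

The labels returned match the model argument values used in the plm calls earlier, so "within" corresponds to the Fixed Effects Model.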