
Finally I made my own way and it really worked. Here is the solution to this problem.
Problem:
Country <- c("India","US", "India","US","India","India","India","US","US","UK","US","India")
State <- c ("IM","UM","IM","UA","IM","IH","IH","UM","UM","UKM","UM","IH")
Sales <- c(1245,3212,345,3412,2423,3787,997,12121,12232,12113,45432,12121)
Revenue <- c(124523,121253,23565,565765,56321,121434,3431212,4532312,122345,232542,12345232,2312212)
SalesReport <- data.frame(Country,State,Sales,Revenue)
View(SalesReport)
Country | State | Sales | Revenue |
India | HM | 1245 | 124523 |
UK | KM | 3212 | 121253 |
India | KO | 345 | 23565 |
US | HM | 3412 | 565765 |
India | KM | 2423 | 56321 |
India | KO | 3787 | 121434 |
India | LH | 997 | 3431212 |
US | OL | 12121 | 4532312 |
US | NJ | 12232 | 122345 |
US | CA | 12113 | 232542 |
US | CA | 45432 | 12345232 |
India | CA | 12121 | 2312212 |
I want to rank my data set over Revenue partition by Country and State. So output will be like.
Country | State | Sales | Revenue | Rank |
India | IH | 997 | 3431212 | 1 |
India | IH | 12121 | 2312212 | 2 |
India | IH | 3787 | 121434 | 3 |
India | IM | 1245 | 124523 | 1 |
India | IM | 2423 | 56321 | 2 |
India | IM | 345 | 23565 | 3 |
UK | UKM | 12113 | 232542 | 1 |
US | UA | 3412 | 565765 | 1 |
US | UM | 45432 | 12345232 | 1 |
US | UM | 12121 | 4532312 | 2 |
US | UM | 12232 | 122345 | 3 |
US | UM | 3212 | 121253 | 4 |
SQL Program:
SELECT * RANK() OVER (PARTITION BY Country, State Order by Revenue DESC) as Rank FROM SalesReport GROUP BY Country, State
SAS Program:
proc rank data=SalesReport out=SalesReportRanking
descending
ties=low;
by Country State;
ranks Rank;
var Revenue;
run;
R Program:
1. After creating dataset, I am creating a KEY which is nothing but concatenate of all columns which are needed for partition by.
2. First I make the dataset ordered by these columns.
3. And now making a KEY “ID”
4. Now ranking the dataset by ID. -x for Descending order.
5. Arranging the dataset for better look.
6. Nullifying the ID (No one should know your trick :P) and view it.
SalesReport <- SalesReport[order(Country,State),]
row.names(SalesReport) <- NULL
SalesReport$ID <- paste(SalesReport[,1],SalesReport[,2])
SalesReportRanking <- transform(SalesReport, Rank = ave (Revenue, ID, FUN = function(x)
rank (-x, ties.method ="min")))
SalesReportRanking <- SalesReportRanking[order(SalesReportRanking$ID,SalesReportRanking$Rank),]
SalesReportRanking$ID <- NULL
View(SalesReportRanking)
The above code can be used for a single column as well. Just replace ID with your column name.
Compare your result with SAS output. Here we go. It matches if not please comment below. I hope it helped you a little bit. Thanks for reading.
Another important queries for this solution:
1. How to use Rank() in SQL Server
2. How to use Rank() in R
3. How to emulate SQL “partition by” in R?
4. How to partition when ranking on a particular column?
Thanks a lot Sr. This help me a lot
having an issue doing this by date,