class: right middle

## R vs Stata

R.Andrés Castañeda

---
class: center, middle

## From Stata to R

---
class: center, middle

## Why R?

---

## R vs Stata

### .f4[Differences]

* R is object oriented while Stata is action oriented:
  + Classic example: Stata's `summarize` vs R's `summary()`
  + In Stata you declare what you want to do, while in R you usually declare the result you want to get

--

* R needs to load non-base commands (packages) at the beginning of each session
  + Imagine that in Stata you had to load a command installed with `ssc install` every time you used it in a new session
  + There is a way to bypass this

---

## R vs Stata

### .f4[Differences]

* R is less specialized:

--

more flexibility and more functionality.

--

* R has a much broader network of users:
  + More resources online, which makes using Google a lot easier. You'll never want to see Statalist again in your life!
  + Development of new features and bug fixes happen faster.

---

## Advantages of Stata

- Stata is more specialized:

--

Some common but complex tasks are simpler

--

- Stata has wider adoption among micro-econometricians (though R adoption is steadily increasing).

--

- The implementation of frames in Stata 16 has made everything easier

--

- Super intuitive

--

- It is getting better and better with each upgrade

???

Running a regression with clustered standard errors

Analyzing survey data with weights

Network externalities in your work environment.

Development of new specialized techniques and tools could happen faster (e.g. ietoolkit).

---

## Advantages of R

- R is free and open-source software!

--

- It allows you to have several datasets open simultaneously.
  + No need to use `keep`, `preserve`, `restore`, or `tempfile`

--

- It can run complex Geographic Information System (GIS) analyses.

--

- You can use it for web scraping.

--

- You can run machine learning algorithms with it

---

## Advantages of R

- You can create complex Markdown documents.
This presentation, for example, is entirely done in R.
  - LaTeX, slides, websites, books, reports, Word documents, PowerPoint presentations, etc.

--

- It reads and works with non-rectangular data

--

- You can create interactive dashboards and online applications with the Shiny package.

---

## Advantages of R

- Help online is amazing!

--

- [RStudio](https://www.rstudio.com/) exists
  + Everything is integrated. The interface, the data, the code, git, and the outcome.

--

- R is constantly updated and fixed.

---

## R vs Python

.center[.f3[.light-red[Disclaimer: ]I don't know enough about Python to make a fair assessment]]

---

## R vs Python

### .f4[In general]

- Python and R were created with different objectives.
  + Python was created as a general programming language
  + R was created for statistical analysis

--

- Over time, both have slowly moved toward each other's side.

--

- You can find a "fair" comparison in this [PDF created by DataCamp](https://s3.amazonaws.com/assets.datacamp.com/email/other/Python+vs+R.pdf).

--

  + .orange[I say "fair" because it is too general, and not specific for economists and social scientists.]

---

## R vs Python

- I prefer R because it was/is designed for and by academics, not by developers.

--

- If there is something that I need to do in Python, I just run Python from R, get the results, and keep working in R

--

(same thing with Stata).

--

- Contrary to what some people say, I think R syntax is easier than Python's.

--

- Coding with `tidyverse` or `data.table` syntax makes the work extremely easy.

---

## R vs Python

- The whole `tidyverse` and RStudio tools are just amazing. There is nothing like that in Python!

--

- You should learn both!

---

## ... BUT!

If I were going to teach my kids one programming language, it would be Python.

---
class: center, middle

![](https://media.giphy.com/media/dILrAu24mU729pxPYN/giphy.gif)

---

## Well, because...

--

- If Python keeps gaining popularity over R, it will look more attractive on their CVs.
--

- It is a general programming language that lets them create anything from scratch, not just analyze data.

--

- I will teach them R, anyway...

---
class: center, middle

## Everything you need to start working with R

---

## Installation

You need R and RStudio installed on your computer:

### Instructions

* Please visit (https://cran.r-project.org) and select a Comprehensive R Archive Network (CRAN) mirror close to you.

* To install RStudio, go to (https://www.rstudio.com/). .red[You need to install R first.]

---
layout: true

## Data in R

---

### In Stata:

* You can open __one dataset__ and perform operations that can change that dataset.

* You can also have other things, such as matrices, macros and tempfiles, but they are secondary. __Most functions only use the main dataset__.

* If you wish to make any non-permanent changes to your data, __you'll need to preserve the original data to keep it intact__.

---

### In R:

__R__ works in a completely different way:

* You can load __as many datasets as you wish__ (or as many as your computer's memory allows)

* Operations will have lasting effects __only if you store them__

---

### In R:

* Everything that exists in R's memory -- variables, datasets, functions -- __is an object__

--

* You could think of an object as a chunk of data with some properties and a name by which you call it

--

* If you create an object, it's going to be stored in memory until you delete it or quit R

--

* Whenever you run anything you intend to use in the future, __you need to store it as an object__.

---
class: middle center

## Question?
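---

### In R: a minimal sketch

The points about objects can be illustrated with a short base R sketch. The data frames `households` and `regions`, and their columns, are made-up toy data assumed here purely for illustration:

```r
# Two datasets live in memory at the same time (hypothetical toy data)
households <- data.frame(id = 1:3, income = c(100, 250, 400))
regions    <- data.frame(id = 1:3, region = c("North", "South", "North"))

# An operation that is not stored prints a result but changes nothing
subset(households, income > 200)
nrow(households)  # the original data frame still has all its rows

# Storing the result creates a new object; the original stays intact
rich <- subset(households, income > 200)

# Combine both datasets without preserve/restore or tempfiles
merged <- merge(households, regions, by = "id")

# summary() adapts to the kind of object it receives
summary(households$income)  # a numeric vector
summary(merged)             # a whole data frame

# Objects stay in memory until you remove them or quit R
rm(rich)
```

Nothing here modifies `households` unless a result is assigned back to that name, which is why no `preserve`/`restore` step is needed.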