This is actually the first tutorial I have ever written on the Internet.

I really enjoyed writing this one, and I plan to write more.

So please leave a comment if you liked it, or if you have any suggestions for improving the content.

The tutorial does not cover the basics of programming in R; it dives straight into the part where the interesting stuff begins. You can learn the basics in about 2 hours from just about anywhere, including the official R documentation (CRAN).

Brush up on topics such as: variable types, plot types, data frames, lists, and so on.
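If you want a one-screen refresher on those basics before the main script, here is a minimal sketch (my own addition; the names are arbitrary):

v <- c(1.5, 2, 3)                                           # a numeric vector
typeof(v)                                                   # "double"
l <- list(name = "R", release = 4)                          # a list can mix types
df <- data.frame(height = c(58, 60), weight = c(115, 117))  # a small data frame
plot(df$height, df$weight)                                  # a basic scatter plot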

The tutorial is written as a commented R script, since I think that is the most interactive way to present the material. Be sure to make your own changes and try tweaks and edits to see how they affect your data.

Feel free to copy the code into RStudio. Press "Ctrl+Enter" on a line of code to execute that line.

In short: what is regression? If I have to put it in a way that spares you the details of the mathematics involved, then:
Regression measures the strength of the relationship between 2 or more variables in a dataset. In even simpler terms, regression fits a line through a set of data points on a graph so that the line stays as close to as many of the points as possible. We can then follow that line into regions where we have no recorded data and predict the values there, which effectively means following the trend set by the existing points.
You might refer to this for a more detailed version: https://www.thebalance.com/what-is-simple-linear-regression-2296697
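For the curious, the "optimized" part has a precise meaning: ordinary linear regression picks the line that minimizes the sum of squared vertical distances between the points and the line. A minimal sketch of that criterion (my own addition, not part of the script below):

#badness of a candidate line y = a*x + b for data vectors x and y
sum_sq_error <- function(a, b, x, y) sum((y - (a * x + b))^2)
#lm() finds the a (slope) and b (intercept) that make this value as small as possible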

Here is the script:

#Use help(function-name) to see the description of the function, argument, feature, attribute, etc. provided
#by the official R documentation
#Visit https://cran.r-project.org for a comprehensive documentation
#Go on executing each line of code by placing the cursor anywhere on the line and pressing "Ctrl+Enter"
#Try changing values and variables to see the changes in effect.
# by - Raghu_Raj_Rai > [email protected]
#Scatter Plots - x vs y type of plot - plot() function
#Loading the women dataset for height and weight data
women
#Loading the 2 data columns into x and y so we have short names to work with
x <- women$height
y <- women$weight
#And here is a basic plot of x against y and then y against x (the general form of the function is plot(x,y))
plot(x,y)
plot(y,x)
#Read the description of the plot() function in the R documentation to learn about its additional features and parameters
help("plot")
#Using additional parameters in the function. pch defines the shape of the points on the graph. Use help for more
help(pch)
plot(x,y,xlab = "Height",ylab = "Weight",main = "Relation between Heights and Weights",col="Blue",pch=20)
#Basics finished here. Now on to the linear regression and multiple regression models.
#The data is already loaded in the x and y variables that we'll be using now, but reloading it here just to be clear.
#loading data set into variables
x <- women$height
y <- women$weight
typeof(x)
#Constructing the relationship between the 2 variables
#This defines the parameters of the regression line that we'll draw over the data
relation <- lm(y~x)
#Use summary to obtain the coefficients of the regression line
summary(relation)
#Creating a plot with the regression line overlay
plot(x,y,xlab = "Height",ylab = "Weight",main = "Relation between Heights and Weights",col="Blue",pch=20,abline(relation))
#Load the data in the following manner to use it as input for predict(). We can't pass the value in directly, because predict() needs a data frame whose column name matches the predictor (x) used in the model
a <- data.frame(x=70)
prediction <- predict(relation,a)
#So now we know that if a woman has a height of 70, the weight predicted from the dataset provided would be roughly 154
prediction
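#(Added aside, not part of the original script) predict() also accepts several new values at once,
#for example:
predict(relation, data.frame(x = c(62, 66, 72)))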
#Points can be overlaid on a previously drawn plot using the points() function. Check its description
help("points")
#Plot the point on the graph to see whether the predicted data fits on the graph.
#In this case, it will match a previous point because we used a value already present in the dataset for the prediction
points(70,prediction,col="Red",pch=24)
#The general form of the regression equation is y = ax + b (y - response variable, x - predictor variable)
#The above example is Simple Regression, where only 2 variables are involved and we obtain a direct relationship between them.
#Check the values of relation to see its intercept and slope
relation
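#(Added aside, not part of the original script) coef() extracts the intercept and slope as a named
#vector, so you can write the fitted equation y = ax + b out explicitly
coef(relation)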
#Multiple Regression - form: y = a + b1.x1 + b2.x2 + ... + bn.xn (y - response variable; a, b1, b2, ..., bn - coefficients; x1, x2, ..., xn - predictor variables)
#The general methodology stays the same; only the way some of the functions are used changes.
#Switching to a different dataset now, since we'll need more than 2 variables. data() lists the datasets that ship with R
data()
#Loading presidents dataset for our experimentation
presidents
#We notice that there are some NAs in our dataset, which likely means that the data for those particular values was never recorded
#We resolve this by replacing all the NAs in our dataset with an average value of 50.
summary(presidents)
#We copy the presidents dataset into our own variable, just to keep the workspace clean and organized.
my_dataset <- presidents
#Now we replace all the NAs in our dataset with 50. 50 is used as a rough average value; it could be something else in other cases, even 0.
my_dataset[is.na(my_dataset)] <- 50
my_dataset
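#(Added aside, not part of the original script) instead of a fixed 50, you could impute with the
#mean of the recorded values: my_dataset[is.na(my_dataset)] <- mean(presidents, na.rm = TRUE)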
#Creating the relationship model between the attributes using the lm() function
#We'll be using qtr1, qtr2 and qtr3 as the predictor variables and qtr4 as the response variable
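#(The lines below are my own sketch of how that fit might look; they are not part of the original script.)
#presidents is a quarterly time series, so one way to get qtr1..qtr4 columns is to reshape it into a
#year-by-quarter data frame first (the column names here are my own choice)
qtr_df <- as.data.frame(matrix(my_dataset, ncol = 4, byrow = TRUE))
names(qtr_df) <- c("qtr1", "qtr2", "qtr3", "qtr4")
multi_relation <- lm(qtr4 ~ qtr1 + qtr2 + qtr3, data = qtr_df)
summary(multi_relation)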

We'll continue with multiple regression and maybe go over the basics at the end. (lol)