Fda Unit 5
• $\rho > 0$ indicates positive serial correlation: the error terms tend to have the same sign from one period to the next.
• $\rho < 0$ indicates negative serial correlation: the error terms tend to have a different sign from one period to the next.

Impure serial correlation
• This type of serial correlation is caused by a specification error, such as an omitted variable or an ignored nonlinearity.
• Suppose the true regression equation is
$$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \varepsilon_t$$
but $X_{2t}$ is left out of the estimated equation. The error term will then capture the effect of $X_{2t}$. Since many economic variables exhibit trends over time, $X_{2t}$ is likely to depend on $X_{2,t-1}, X_{2,t-2}, \ldots$ This translates into an apparent correlation between $\varepsilon_t$ and $\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots$, and this serial correlation violates the assumption that the error terms are uncorrelated with each other.
• A specification error in the functional form can also cause this type of serial correlation. Suppose the true relationship between Y and X is quadratic, but we assume it is linear. The error term will then depend on $X^2$.

The consequences of serial correlation :
1. Pure serial correlation does not cause bias in the regression coefficient estimates.
2. Serial correlation causes OLS to no longer be a minimum variance estimator.
3. Serial correlation causes the estimated variances of the regression coefficients to be biased, leading to unreliable hypothesis testing.

5.5.3 Autocorrelation
• Autocorrelation refers to the degree of correlation of the same variable between two successive time intervals. It measures how the lagged version of a variable is related to its original version in a time series.
• The value of autocorrelation varies between + 1 and - 1. A value between - 1 and 0 represents negative autocorrelation. A value between 0 and 1 represents positive autocorrelation.
• A very small autocorrelation value does not necessarily mean that the series has no dependence on its past: autocorrelation only captures linear association, and the relationship could be non-linear.
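To make the sign convention concrete, the following is a minimal sketch of the usual sample autocorrelation at a given lag, written in plain Python. The function name `autocorrelation` and the two toy series are illustrative assumptions, not part of the original text; a steadily rising series should give a positive value and a series that flips sign every period should give a negative one.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of `series` at the given lag (hypothetical helper)."""
    n = len(series)
    mean = sum(series) / n
    deviations = [value - mean for value in series]
    # Denominator: total sum of squared deviations from the mean.
    denominator = sum(d * d for d in deviations)
    # Numerator: products of each deviation with the deviation `lag` steps earlier.
    numerator = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
    return numerator / denominator


# Toy data (assumed for illustration only).
trend = list(range(1, 11))   # 1, 2, ..., 10: values move together over time
alternating = [1, -1] * 5    # sign flips every period

print(autocorrelation(trend))        # positive, about 0.7
print(autocorrelation(alternating))  # negative, about -0.9
```

Library routines such as the autocorrelation functions in statsmodels or pandas may use slightly different normalisations, so their values need not match this sketch exactly.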
• Autocorrelation gives information about the trend of a set of historical data, so it can be useful in the technical analysis of the equity market.

TECHNICAL PUBLICATIONS® - an up-thrust for knowledge
Fundamentals of Data Science and Analytics (5 - 17) Analytics

• Fig. 5.5.1 shows positive and negative autocorrelation.

[Fig. 5.5.1 : (a) Positive autocorrelation (b) Negative autocorrelation]

• Through autocorrelation, a technical analyst can learn how the stock price on a particular day is affected by the prices on previous days, and so can estimate how the price will move in the future.
• If the price of a stock with strong positive autocorrelation has been increasing for several days, the analyst can reasonably expect the price to keep moving upward over the next few days. The analyst may buy and hold the stock for a short period of time to profit from the upward price movement.
• Autocorrelation analysis only provides information about short-term trends and tells little about the fundamentals of a company. Therefore, it can only be applied to support trades with short holding periods.

5.6 Introduction to Survival
• Survival analysis is used to analyze data in which the time until an event is of interest. The response is often referred to as a failure time, survival time or event time.
• Originally, this branch of statistics developed around measuring the effects of medical treatment on patients' survival in clinical trials. For example, imagine a group of cancer patients who are administered a certain new form of treatment. Survival analysis can be used to analyze the results of that treatment in terms of the patients' life expectancy.

Censoring :
• Censoring is present when we have some information about a subject's event time, but we don't know the exact event time. For the analysis methods we will discuss to be valid, the censoring mechanism must be independent of the survival mechanism.
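As an illustration of what "some information, but not the exact event time" looks like in data, here is a small sketch in plain Python. It assumes, purely for the example, that follow-up stops at a fixed time `study_end` and that we could see each subject's true event time; the names `right_censor`, `true_event_times` and `study_end` are hypothetical. Each subject is recorded as a pair (observed time, event observed?).

```python
def right_censor(event_times, study_end):
    """Turn true event times into (observed_time, event_observed) records.

    Hypothetical helper: a subject whose event happens after `study_end`
    is recorded at `study_end` with event_observed=False (censored).
    """
    records = []
    for time in event_times:
        event_observed = time <= study_end
        records.append((min(time, study_end), event_observed))
    return records


# Made-up true event times in years; in a real study the values beyond
# study_end would never be seen.
true_event_times = [2.0, 5.5, 7.0, 3.2]
records = right_censor(true_event_times, study_end=5.0)

for observed_time, event_observed in records:
    status = "event" if event_observed else "censored"
    print(observed_time, status)
```

For the two censored subjects we only know that the event time exceeds 5.0 years, which is exactly the partial information the definition describes.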
• There are generally three reasons why censoring might occur :
a. A subject does not experience the event before the study ends.
b. A person is lost to follow-up during the study period.
c. A person withdraws from the study.
• These are all examples of right-censoring.
• Types of right-censoring :
1. Fixed type I censoring occurs when a study is designed to end after C years of follow-up. In this case, everyone who does not have an event observed during the course of the study is censored at C years.
2. In random type I censoring, the study is designed to end after C years, but censored subjects do not all have the same censoring time. This is the main type of right-censoring we will be concerned with.
3. In type II censoring, a study ends when a pre-specified number of events has occurred.
• The survival function is a function of time (t) and can be represented as :
$$S(t) = \Pr(T > t)$$
where Pr() stands for probability and T for the time of the event of interest for a random observation from the sample. We can interpret the survival function as the probability that the event of interest (for example, death) has not occurred by time t.
• The survival function takes values in the range between 0 and 1 (inclusive) and is a non-increasing function of t.

5.7 Two Marks Questions with Answers

Q.1 What is logistic regression ?
Ans. : Logistic regression is a form of regression analysis in which the outcome variable is binary or dichotomous. It is a statistical method used to model dichotomous or binary outcomes using predictor variables. Logistic regression is one of the supervised learning algorithms.

Q.2 What is omnibus test ?
Ans. : The omnibus test is a likelihood-ratio chi-square test of the current model versus the null model. A significance value of less than 0.05 indicates that the current model outperforms the null model.
Omnibus tests are generic statistical tests used for checking whether the variance explained by the model is greater than the unexplained variance.

Q.3 Define serial correlation.
Ans. : Serial correlation is the relationship between a given variable and a lagged version of itself over various time intervals. It measures the relationship between a variable's current value and its past values.

Q.4 What are the consequences of serial correlation ?
Ans. :
1. Pure serial correlation does not cause bias in the regression coefficient estimates.
2. Serial correlation causes OLS to no longer be a minimum variance estimator.
3. Serial correlation causes the estimated variances of the regression coefficients to be biased, leading to unreliable hypothesis testing.

Q.5 Define autocorrelation.
Ans. : Autocorrelation refers to the degree of correlation of the same variable between two successive time intervals. It measures how the lagged version of a variable is related to its original version in a time series.

Q.6 What are reasons for censoring ?
Ans. : There are generally three reasons why censoring might occur :
a. A subject does not experience the event before the study ends.
b. A person is lost to follow-up during the study period.
c. A person withdraws from the study.

Q.7 Explain regression using statsmodels.
Ans. : Linear regression in statsmodels fits a model for the case where one variable depends directly on another. There is one dependent variable and one independent variable. Given a change in the value of the independent variable, the fitted model is used to predict the change in the dependent variable.

Q.8 Why is residual analysis important ?
Ans. : Residual (error) analysis is important to check whether the assumptions of regression models have been satisfied. It is performed to check the following :
1. The residuals are normally distributed.
2. The variance of the residuals is constant (homoscedasticity).
3. The functional form of the regression is correctly specified.
4. Whether there are any outliers.

Q.9 What is spurious regression ?
Ans. : The regression is spurious when we regress one random walk onto another independent random walk. It is spurious because the regression will most likely indicate a non-existing relationship.
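
The spurious regression effect in Q.9 can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not a method from the text: it uses only the Python standard library, the helper names (`random_walk`, `slope_t_stat`, `share_significant`) are hypothetical, and the cutoff |t| > 2 is used as a rough stand-in for a 5 % significance test. It repeatedly regresses one random walk on another, independent one and reports how often the slope looks "significant".

```python
import math
import random


def random_walk(n_obs, rng):
    """Cumulative sum of independent standard normal steps."""
    level, path = 0.0, []
    for _ in range(n_obs):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def slope_t_stat(x, y):
    """Simple OLS of y on x with intercept; returns (slope, t-statistic of slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and the usual OLS standard error of the slope.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope / std_err


def share_significant(n_sims=200, n_obs=100, seed=0):
    """Fraction of simulations in which |t| > 2 for two independent random walks."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        x = random_walk(n_obs, rng)
        y = random_walk(n_obs, rng)  # generated independently of x
        _, t_stat = slope_t_stat(x, y)
        if abs(t_stat) > 2.0:
            hits += 1
    return hits / n_sims


print(share_significant())
```

If the usual t-test were reliable here, the printed share would be close to 0.05, since the two series are unrelated by construction. With random walks it should come out far higher, which is what "most likely indicate a non-existing relationship" means in practice.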