COURSES
END OF SEMESTER COMMENTS
★★★★★
Everything is super good. The professor has a really good grip on what he is delivering, and the main thing I want to mention is the teaching assistant for this course: Kiran Fetah is superb at helping students in every way they need. Thanks.
★★★★★
Professor and TA's gave continuous support. I am gonna really miss this class. 
★★★★★
The most positive thing is the TA office hours, which are very useful for clarifying doubts when solving assignments. There is nothing really negative, but the deadlines for the assignments and the project are a bit hectic toward the end of the semester, though the professor and TA worked a lot to give us more time.
★★★★★
Both Professor and TA's are very supportive and helpful. 
★★★★★
The topics were explained very clearly and, thanks to the professor's and TAs' help, we were able to grasp and solve problems statistically. There are no negative attributes of this class.
★★★★★
Too many assignments made me feel a little stressed.
★★★★★
A lot of practical knowledge learnt. Overall a good subject and good teaching.
★★★★★
The positive is that the professor's way of teaching is top level, but when it comes to the assignments there was some ambiguity between the professor's and the TAs' explanations.
★★★★★
I really liked the weekly assignments that allowed us to apply the concepts we learnt in class. I like how the professor delivers the concepts in class using scenarios and things we can relate to in everyday life. I very much appreciate the effort of the TAs in ensuring we grasp the concepts and apply them correctly to real-world problems, through their numerous office hours and rapid responses to personal messages on Teams. But sometimes the assignments get overwhelming.
★★★★★
The good thing is that this gave a good exposure to the applications on real world problems. There is nothing much negative but the only thing is the assignments sometimes are challenging. Overall, I feel very glad that I took this subject. 
★★★★★
I found the lectures and TA hours awesome. It would be fairer if you could decrease the workload and reduce the difficulty of the assignments. Also, if you could consider 10 marks as one grade, I would be satisfied in terms of results.
★★★★★
Overall a good learning experience Thanks 
★★★★★
Positive: Everything follows a sequential pattern: lecture, followed by the TAs' practical session, assignment, and office hours. The grading was on point and punctual. The TAs were very supportive. Negative: The course load was heavy because of the weekly assignments. But that helped us learn a lot.
★★★★★
Positive Attributes: Helps make sense of complex data and make decisions. Learned about the assumptions that underlie statistical models and the importance of understanding the limitations of these models. Negative Attributes: Difficulty in communicating the results.
★★★★★
I thoroughly enjoyed the course and found it to be an enriching learning experience. The course content was well-structured, and the topics covered provided a comprehensive understanding of the subject matter. Throughout the course, I gained valuable insights into statistical methods and their application in research. The practical examples and case studies helped in reinforcing the concepts and made the learning process engaging and enjoyable. I particularly appreciated the emphasis on real-world applications, which helped me relate the material to my own research interests.
★★★★★
Classes were informative and we had weekly assignments, which were a bit stressful, but I managed to finish them on time. Each one was a new task and helped me learn new things. Our TAs were very helpful.
★★★★★
The negative is the large number of assignments.
★★★★★
I learned so much from the continuous weekly assessments and projects with real-world data... This course will certainly help me a lot in my career!
★★★★★
Positive:

★★★★★
The course exceeded my expectations in terms of both content and instruction. The professor's, Vitalii's and Fettah's expertise, enthusiasm, and commitment to student success made this an exceptional learning experience, and I am grateful for the opportunity to have taken this course. 
KEY INFORMATION
 Prof. Ioannis Pavlidis, Office Hours: 3-4 pm on Fridays @ TEAMS
 Vitalii Zhukov, Office Hours: 11 am-12 pm on Mondays @ TEAMS
 Fettah Kiran, Office Hours: 10-11 am on Fridays @ TEAMS
 13 x 3% Homework
 61% Project
 Friday, 4:00-7:00 pm @ TEAMS & HBS 315

The project can be done either individually or in pairs.
Pairs need to be declared by the end of the second week of classes.
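The weights listed above (13 homework assignments at 3% each, plus a 61% project) add up to 100%. A minimal sketch of how they might combine into a course total is shown here, assuming every score is on a 0-100 scale and that the total is a plain weighted sum; the function name `final_score` is hypothetical, and the course's actual grading rules (late penalties, rounding, letter-grade cutoffs) are not stated on this page.

```python
HOMEWORK_COUNT = 13      # from the course page: 13 homework assignments
HOMEWORK_WEIGHT = 0.03   # each homework counts for 3%
PROJECT_WEIGHT = 0.61    # the project counts for 61%


def final_score(homework_scores, project_score):
    """Weighted course total, assuming all scores are on a 0-100 scale.

    This is an illustrative assumption, not the official grading formula.
    """
    if len(homework_scores) != HOMEWORK_COUNT:
        raise ValueError(f"expected {HOMEWORK_COUNT} homework scores")
    homework_part = sum(score * HOMEWORK_WEIGHT for score in homework_scores)
    return homework_part + project_score * PROJECT_WEIGHT


if __name__ == "__main__":
    # Example: 80 on every homework and 90 on the project.
    print(round(final_score([80] * 13, 90), 2))
```

Under these assumptions the homework contributes at most 39 points and the project at most 61, so the project alone outweighs all the homework combined.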
[1] Donna L. Mohr, William J. Wilson, Rudolf J. Freund. Statistical Methods. 4th Edition. Academic Press, 2021.
[2] Terence C. Mills. Applied Time Series Analysis. 1st Edition. Academic Press, 2019.
COURSE OUTLINE
 Topics to Cover: Situating Statistics, Machine/Deep Learning, and Data Science; observations and variables; types of measurements for variables; distributions; numerical descriptive statistics; exploratory data analysis; bivariate data; data collection
 Topics to Cover: Probability; discrete probability distributions; continuous probability distributions; sampling distributions
 Homework #1 due at 4 pm on 02/03/2023
 Topics to Cover: Hypothesis testing; estimation; sample size; assumptions
 Homework #2 due at 4 pm on 02/09/2023
 Assignment of Projects
 Topics to Cover: Inferences on the population mean; inferences on a proportion; inferences on the variance; assumptions
 Homework #3 due at 4 pm on 02/16/2023
 Topics to Cover: Inferences on the difference between means using independent samples; inferences on variances; inferences on means for dependent samples; inferences on proportions; assumptions and remedial methods
 Homework #4 due at 4 pm on 02/23/2023
 Topics to Cover: Analysis of variance; linear model; assumptions; specific comparisons; random models; unequal sample sizes; analysis of means
 Homework #5 due at 4 pm on 03/02/2023
 Topics to Cover: The regression model; estimation of parameters; inferences for regression; correlation; regression diagnostics
 Homework #6 due at 4 pm on 03/09/2023
 Topics to Cover: The multiple regression model; estimation of coefficients; inferential procedures; correlations; special models; multicollinearity; variable selection; detection of outliers
 Homework #7 due at 4 pm on 03/23/2023
 Topics to Cover: The dummy variable model; unbalanced data; models with dummy and interval variables; weighted least squares; correlated errors
 Homework #8 due at 4 pm on 03/30/2023
 Topics to Cover: Two-factor factorial experiment; randomized block design; randomized blocks with sampling; repeated measures designs
 Homework #9 due at 4 pm on 04/06/2023
 Topics to Cover: Hypothesis test for a multinomial population; goodness of fit using the χ² test; contingency tables; log-linear model
 Homework #10 due at 4 pm on 04/13/2023
 Topics to Cover: Logistic regression; multinomial regression
 Homework #11 due at 4 pm on 04/20/2023
 Topics to Cover: One sample; two independent samples; more than two samples; rank correlation; the bootstrap
 Homework #12 due at 4 pm on 04/27/2023
 Topics to Cover: Time series and their features; stationary processes (ARMA); nonstationary processes (ARIMA)
 Homework #13 due at 4 pm on 05/04/2023
 Project Reports due at 4 pm on 05/04/2023
WEEKLY GRADES AND STUDENT COMMENTS
Comments from students 
★★★★★
Everything is good 
Comments from students 
★★★★★
All good 
Comments from students 
★★★★★
Everything is good. Thank you. 
★★★★★
HW04 was not well explained and was hard to implement. It took a lot of my time to finish. I hope that from next week you will provide more description and make it a little easier. Moreover, we also have a project to do.
Comments from students 
★★★★★
Thanks for the lecture 
★★★★★
I have some comments on the weekly homework, as follows: 1) We need rubrics for every homework, since I lost the same marks for HW1 (3 feedback comments) and HW2 (one feedback comment)! 2) We need more detailed requirements for the weekly homework. For example: "For the curated PP, HRE4 and HRAW signals construct QQ plots. The plots should be done for each participant and laid out in the two-page format you are familiar with from Q1 in HW 1. The aim is to provide a comparative and insightful account of signal normality to the analyst. State your observations and thoughts." The solution shown to us in class was in four pages (Before and After), while Q2 mentioned two pages! 3) We need the final solutions for the weekly homework. Thank you!
Comments from students 
★★★★★
No technical difficulties this week, which is appreciated.
★★★★★
Today's class had a lot of useful content. Thanks to the professor for all the information provided in today's class about assignments, and for one more chance to make up Homework 1.
★★★★★
Very difficult and time consuming homeworks which require strong knowledge of R programming. Takes an entire week to complete leaving no time for other classes. 
Comments from students 
★★★★★
Unfortunately the online portion had lots of technical problems today, especially towards the end of the first session and continuing throughout Vitalii's demo. The audio would cut out for minutes at a time. I have no complaints about the content, just the stream itself. 
★★★★★
The class is interesting; however, I need time to figure out the homework. I suggest one week for each homework, since I am taking two classes besides this one. Please. I have one comment for Vitalii: please explain slowly, since I am a beginner in R. The professor, Vitalii, and Fettah are doing a great job.
★★★★★
I have zero knowledge in R programming and for someone like me I felt the homeworks are very difficult. 
★★★★★
Dear professor, I have no experience in R. The pace of the class felt fast to me, but I am catching up by listening to the recorded lectures again. I found the assignment difficult to solve and, going forward, since the deadline is fixed on Wednesday, I am concerned about the coursework. I kindly request that you be considerate when giving assignments.
★★★★★
Good 
★★★★★
I felt the assignment was too tough. Firstly, I am new to R, and the first assignment was beyond beginner level. So I request that, if there is any chance, the assignment level be reduced so that I can learn from the foundations.
★★★★★
I felt Homework 1 was too difficult to complete, even though the deadline was extended, because I am not very familiar with R. I kindly request that you lower the level of difficulty, if you can, so that I can learn from the basics.
★★★★★
I feel each assignment should have a duration of 1 week to complete them as it is my first time working with R and it consumes a lot of time. And we also have other subjects to work on. 
★★★★★
The assignment is so difficult. I understand the topic, but when it comes to the assignment, that does not help much. And the assignment takes a lot of time, which is causing problems for my other courses.
★★★★★
Please give one week for every assignment 
★★★★★
Found the assignment somewhat interesting but di... 
★★★★★
I feel HW01 is very difficult. I have followed the class and office hours but it was so hard for me to get my head around the assignment. If this is the difficulty level for HW01, I wonder how the level of upcoming homeworks is going to be. 
★★★★★
Hope to extend the work time. 
★★★★★
Homework 1 was very, very difficult; it takes a lot of time to solve the questions, and the questions themselves are not fully understandable. It is even affecting my other courses, as I need to spend a lot of time on this course.
★★★★★
I felt that Homework 1 was very difficult. As we are new to R, we cannot code at that level at the very beginning. I request that the difficulty be reduced, as the deadline was also a mere 3 days.
Comments from students 
★★★★★
Professor and TA explanation is pretty good with examples. 
Comments from students 
★★★★★
I wish there had been more time to explain the examples from RStudio.
★★★★★
The professor and TA have taught the class really well. The hands-on session is very informative; I'd like to name it 'Crash Course on R'.
★★★★★
Very good and informative session 
★★★★★
If time permits, I'd recommend a very short introduction to tidyverse and all the verbs that come with it. Especially the %>% (pipe operator) and dplyr seem very much in use these days.
★★★★★
Good session overall 
★★★★★
The class pace could be slowed down a bit for better understanding.
★★★★★
As someone who has never used R, I found the intro a bit rushed. However, for most it would likely be fine. As we are given the recording, I can rewatch it, pausing to absorb it. Also, I like that you are doing this: obtaining feedback early in the class. My first impression is that this will be unlike any university course I have ever taken, in a good way. I'm excited to see how it all unfolds, both from the materials perspective and from an academic implementation perspective. Thank you.
★★★★★
The class was interesting 
★★★★★
Classes are interactive. 
★★★★★
Looks good, but if the pace could be a bit slower, I think I could catch up easily.
★★★★★
It is good and I am looking forward to learning much more R.
★★★★★
The class was good as I learned the basics of R language 
★★★★★
Looking forward to have a great semester. 
★★★★★
No comments, everything was good.
★★★★★
The explanation is quite good and it would be more interesting if it includes much more kinds of examples. 
★★★★★
Class was very Useful. Thank you. 
END OF SEMESTER COMMENTS
★★★★★
The lectures were easy to understand and carefully explained by the professor, including answers to our doubts. The presentations saved on Blackboard serve as additional references. The TA explained the concepts and code very well. The assignments were graded very strictly, i.e., 1 mark is deducted if one line says 16 and another says 14 (by mistake), and 1 for report quality. I think it would have been much better if the deduction were 0.5 rather than 1. Thanks for all the help.
★★★★★
Learnt a lot. Thanks to the professor and TAs.
★★★★★
Could have been a little lenient on the grading.
★★★★★
Just an awesome class, but I felt that the course load (HW) was a bit much.
★★★★★
The class was very informative and challenging. I learned a good deal from this course. Excited to be taking Ubiquitous Computer next fall under Dr. Pavlidis!
★★★★★
Professor Pavlidis and the TAs Vitalii and Shaila taught the course in an organized manner, providing a learning environment that encouraged participation and discussion. Weekly homework allowed for steady practice of the information gained from lectures. Labs were extremely helpful in completing homework assignments. Professor Pavlidis provided interesting datasets from the medical field and educational research. Although I do wish I had participated in the in-person course, given the unique personalities of the professor and TAs, the online form of the course was extremely convenient and useful.
★★★★★
One of the best classes I have ever taken. Professor: Dr. Ioannis T Pavlidis is the best professor, in my opinion. He is always ready to clear your doubts. Getting responses from him is the easiest task, even on weekends, which makes him unique. If you ask him any question at any moment, he is very happy and eager to answer. His knowledge in the field is beyond limits. He always listens to the feedback provided by the students and makes the necessary changes in class delivery if students need them (as long as they are reasonable, for sure). Definitely, A LOT to learn from him. TAs: Vitalii and Shaila are the best at their level. For the practical session, you can always reach out to Vitalii; he is always ready to guide you in the right direction. For any grading or project-related queries, Shaila is always ready to give you logically correct feedback. With their help, you will definitely learn how to write and organize your code and report. Class quality and workload: Definitely, you will have a huge workload, as nothing comes easily. When you have time and want to invest it instead of wasting it, surely go for this class. You will enjoy it a lot and will learn a lot. After finishing this class, you will 100% feel that it was useful and that all the hard work (ofc smart work is needed 😇😇) you have done gave you fruitful results. Kudos 🙌🏻🙌🏻🙌🏻 to one of the best teams I have ever worked with.
KEY INFORMATION
 Prof. Ioannis Pavlidis, Office Hours: 3-4 pm on Fridays @ TEAMS
 Vitalii Zhukov, Office Hours: 3-4 pm on Mondays @ TEAMS
 Shaila Zaman, Office Hours: 2-3 pm on Fridays @ TEAMS
 13 x 3% Homework
 61% Project
COURSE OUTLINE
 Topics to Cover: Situating Statistics, Machine Learning, and Data Science; observations and variables; types of measurements for variables; distributions; numerical descriptive statistics; exploratory data analysis; bivariate data; data collection
 Topics to Cover: Probability; discrete probability distributions; continuous probability distributions; sampling distributions
 Homework #1 due at 7 pm on 01/31/2022
 Topics to Cover: Hypothesis testing; estimation; sample size; assumptions
 Homework #2 due at 7 pm on 02/15/2022
 Assignment of Projects
 Topics to Cover: Inferences on the population mean; inferences on a proportion; inferences on the variance; assumptions
 Homework #3 due at 11:59pm on 02/22/2022
 Topics to Cover: Inferences on the difference between means using independent samples; inferences on variances; inferences on means for dependent samples; inferences on proportions; assumptions and remedial methods
 Homework #4 due at 11:59pm on 03/01/2022
 Topics to Cover: Analysis of variance; linear model; assumptions; specific comparisons; random models; unequal sample sizes; analysis of means
 Homework #5 due at 11:59pm on 03/08/2022
 Topics to Cover: The regression model; estimation of parameters; inferences for regression; correlation; regression diagnostics
 Homework #6 due at 11:59pm on 03/22/2022
 Topics to Cover: The multiple regression model; estimation of coefficients; inferential procedures; correlations; special models; multicollinearity; variable selection; detection of outliers
 Homework #7 due at 11:59pm on 03/29/2022
 Topics to Cover: The dummy variable model; unbalanced data; models with dummy and interval variables; weighted least squares; correlated errors
 Homework #8 due at 11:59pm on 04/05/2022
 Topics to Cover: Logistic and Multinomial Regression
 Homework #9 due at 11:59pm on 04/12/2022
 Topics to Cover: Two-factor factorial experiment; randomized block design; randomized blocks with sampling; repeated measures designs
 Homework #10 due at 11:59pm on 04/19/2022
 Topics to Cover: Hypothesis test for a multinomial population; goodness of fit using the χ² test; contingency tables; log-linear model
 Homework #11 due at 11:59pm on 04/26/2022
 Topics to Cover: One sample; two independent samples; more than two samples; rank correlation; the bootstrap
 Homework #12 due at 11:59pm on 05/03/2022
 Topics to Cover: Time series and their features; stationary processes (ARMA); nonstationary processes (ARIMA)
 Homework #13 due at 11:59pm on 05/10/2022
 Project Reports due at 7 pm on 05/10/2022
WEEKLY GRADES AND STUDENT COMMENTS
Comments from students 
★★★★★
Interesting assignment. 
★★★★★
The session was really helpful in terms of explaining the rank variables. The grading consideration by the professor (i.e., backlog issue) was thoughtful and nice. 
★★★★★
The class was very nice and organized. 
Comments from students 
★★★★★
The class was nice and informative as always. 
★★★★★
Class was good enough and thanks for considering the request about the level of the homework. 
★★★★★
Good lecture. Homework makes us think a lot about why we're doing what has been taught in the lecture. Great help from Vitalii for homework. 
Comments from students 
★★★★★
The class was very organized and helpful. I have a suggestion, about which I might be wrong, but I think it will be helpful to all the students: as the end of the semester is coming and we also have to spend more time on project milestone 3, if the homework series becomes a little easier, then we can give more time to the project milestone. I hope you will consider this.
★★★★★
The class was right paced and all the materials were well explained. 
Comments from students 
★★★★★
Professor's lecture was so good and clear. 
★★★★★
It was very disappointing to listen to the feedback from other students. I emailed the professor and TAs during the weekend, and I got all of my doubts resolved. There is no question to be raised about the responsiveness, the delivery of the class, or the behavior of the TAs if one attends the lectures carefully and regularly. It is one of the best classes I have ever taken, so in my opinion the professor can disregard those negative reviews, and I hope he will not think badly of the students who attend lectures sincerely and do their work on time. Kudos to the professor and both TAs; they are doing their jobs at the best level.
Comments from students 
★★★★★
Please provide more detailed descriptions in the homework.
★★★★★
The class was interesting. But I think today is the deadline for the project milestone. So, that might affect the overall interaction with the class. 
★★★★★
A few more hints to assignments would be good. 
★★★★★
Please be more descriptive in assignments and projects. What's taught in class is not enough to do the assignments. We have to wait till Monday to get clarification on many unclear instructions during the TA session. Vitalii helps us with our doubts but often times what he explains is not what is expected by Shaila. We confirmed with Vitalii that IQR is a good outlier removal method for HW 6 but that wasn't accepted by Shaila. 
★★★★★
The session was very insightful. The knowledge which I got from the theory and practical session related to the general linear model helped me in working on the assignment. 
★★★★★
In the 6th assignment we followed the inputs from Dr. Pavlidis and Vitalii. To remove outliers, we arrived at a strategy after plotting the data: after binding, we removed the data points whose absolute difference is greater than 16, treating them as outliers. But we were told this is not a good strategy, and we were never told which outlier removal strategy is best suited to the problem. I think the only way to make it clear would be to announce which strategy is best suited to the current problem. Please correct me if I went wrong. Thanks.
Comments from students 
★★★★★
It is really notable and good to get a response over the weekend via mail from the professor. 
★★★★★
Thanks for adding more descriptions to assignments. 
★★★★★
The materials covered are pretty intense. However, the professor went through the topics at a very reasonable speed. Maybe methods like BE, FS, etc. could be explained in a bit more detail and in a more interpretable manner. Other topics were explained very well.
Comments from students 
★★★★★
Excellent! 
Comments from students 
★★★★★
Need more directions on the assignments. HW 4 was very difficult 
★★★★★
The unavailability of the TA (Ms. Shaila) for doubts regarding assignments before the deadline is causing submissions with unresolved doubts.
Comments from students 
★★★★★
Need more time for homeworks 
★★★★★
The overall lecture is good. Vitalii's TA hours are excellent; he even answers on Sundays. However, this homework series is handled by Shaila, so Vitalii cannot give definite answers as he did for the first homework series. We asked Shaila about our doubts, but she may have been busy over the weekend, so she did not answer them. It would be very helpful if homework were assigned with Vitalii's involvement in mind, so that students can ask about doubts and submit the assignment by the deadline. The assignments are difficult, and not getting answers from the appropriate TA makes them very difficult.
★★★★★
Very informative regarding lecture content and clear on expectations for upcoming HW/projects.
★★★★★
need more time for assignments if possible. 
★★★★★
The class is indeed helpful as always. However, I missed some visual plots today. 
★★★★★
I understood the entire class and I have revised the slides also after the class. But what is to be expected from the homework part is quite confusing for me. I am literally struggling a lot with the interpretation of the tests' results. 
Comments from students 
★★★★★
Topics are very interesting. 
★★★★★
Expecting a clear picture in the assignments. 
★★★★★
Had a better understanding of hypothesis testing: how and on what basis the data should be tested (one-sided, two-sided (alpha)) to determine whether to reject or accept the null, and how to interpret the result.
Comments from students 
★★★★★
I think the professor is at the right pace, but the R programming part is going a little too fast. The TA is doing a great job though; just slightly slower would be perfect. If there is something in R like Jupyter Notebook, where the code explanations can be put in English, that might be easier too.
★★★★★
Class is very long. 
★★★★★
Knowledgeable and interesting. 
★★★★★
It is very difficult to get into the homework.
★★★★★
It was an interesting class. Good to know that you have extended the deadline and given us some more time. The TA is excellent and ready to help. Great work by Vitalii!!
Comments from students 
★★★★★
It is a lot of information in 3 hours. I think a frequent 5 minute break should be given accordingly so we can process all the information and maybe try a thing or two on our own to completely grasp the concept. The homework is really meticulous and a great practice. 
★★★★★
Please extend the deadlines for submission of the assignments 
★★★★★
The topics are pretty interesting. I really like the idea of diving into complex topics slowly. 
★★★★★
I think it would be great if the assignments could also be done in teams of 2 members.
★★★★★
Hello, I am content with the way the class is conducted. I like the theory part followed by the practical learning. The assignment was also good; I learned a lot of new things related to the R language and how to do some analysis in it. Thanks, Nilesh
★★★★★
The class was good and the lab was very good. This is a new subject for us, so the professor should give students some time to learn R before submitting assignments. Otherwise everything is good.
★★★★★
The HW is not difficult but a little bit confusing even with the explanations. Could you please make it clearer and easier to understand next time? 
★★★★★
Hello Prof and TA, I have no complaints in teaching aspects but it's just if we have doubts in assignment during weekends, we have to wait for Monday (the day of deadline) for TA hours. Is there anything we can do about it? Thank you. 
★★★★★
The content and the code explanation helped a lot for understanding and completing the assignment. 
★★★★★
The session was quite excellent and useful while doing homework. I have one suggestion, it might be wrong but this is what I felt. Homework is assigned in the class, at that moment we are not prepared to make questions. Once we actually implement the script, that moment we encountered more queries. TA hour is on Monday, so we can clear maximum doubts in that session. The best thing is Vitalii was replying on Teams on Sunday also, but it might be possible he is not available. At that moment we might not get enough time to solve our doubts and fix the error in the script. So, I think the management can adjust this thing either by changing TA hour or by giving some more time after TA hour. 
Comments from students 
★★★★★
well organized class structure 
★★★★★
course structure is nice 
★★★★★
The intro session was good and I liked the immediate responses given by the TAs to the questions raised. But the break in between could be a bit longer in my opinion. 
★★★★★
The way sessions are conducted that is first starting with some theory and then doing some practical is a great way of understanding the concepts. 
★★★★★
I think the class is good and informative, but I feel it is somewhat fast when it comes to R programming.
★★★★★
The first session was very informative but hasty. I would request that you explain the topics slowly so I can keep up.
★★★★★
Class was good and easy to understand, bit fast paced but was well organized. 
★★★★★
The lecture was very informative. 
★★★★★
It was a recall of the previous topics and the introduction to R was great! Looking forward to learning more about it. 
★★★★★
From the first class, had a good understanding of the fundamentals, syllabus and requirements of this course. 
★★★★★
The class is very informative and well-organised. The pace of the class seemed a little fast for me, especially the session on R, maybe because I am completely new to R!
★★★★★
I felt that the assignment deadlines could be extended a little. Since the class is on Friday, 3 days for submission will make us work on the weekends, which is okay sometimes, but we might be occupied by other things.
★★★★★
The first class was good, but I felt it was fast-paced, so it would be great if the pace were slowed down a little in future sessions.
KEY INFORMATION
 Prof. Ioannis Pavlidis, Office Hours: 3-4 pm on Fridays @ TEAMS
 Vitalii Zhukov, Office Hours: 2-3 pm on Mondays @ TEAMS
 10% Participation
 50% (5 x 10%) Homework
 40% Project
COURSE OUTLINE
 Topics to Cover: Situating Statistics and Machine Learning in Data Science; observations and variables; types of measurements for variables; distributions; numerical descriptive statistics; exploratory data analysis; bivariate data; data collection
 Topics to Cover: Probability; discrete probability distributions; continuous probability distributions; sampling distributions
 Homework #1 Out
 Topics to Cover: Hypothesis testing; estimation; sample size; assumptions
 Assignment of Projects
 Topics to Cover: Inferences on the population mean; inferences on a proportion; inferences on the variance of one population; assumptions
 Homework #1 Due
 Homework #2 Out
 Topics to Cover: Inferences on the difference between means using independent samples; inferences on variances; inferences on means for dependent samples; inferences on proportions; assumptions
 Topics to Cover: Analysis of variance; linear model; assumptions; specific comparisons; random models; unequal sample sizes; analysis of means
 Homework #2 Due
 Homework #3 Out
 Project, milestone #1 Due 3/07/2021
 Topics to Cover: The regression model; estimation of parameters; inferences for regression; correlation; regression diagnostics
 Topics to Cover: The multiple regression model; estimation of coefficients; inferential procedures; correlations; special models; multicollinearity; variable selection; detection of outliers
 Topics to Cover: The dummy variable model; unbalanced data; models with dummy and interval variables; weighted least squares; correlated errors
 Homework #3 Due
 Homework #4 Out
 Topics to Cover: Logistic regression
 Project, milestone #2 Due
 Topics to Cover: Factorial experiments
 Topics to Cover: Block design; repeated measures designs
 Homework #4 Due
 Homework #5 Out
 Topics to Cover: One sample; two independent samples; more than two samples; rank correlation; the bootstrap
 Homework #5 Due 5/03/2021
 Project, milestone #3 Due 5/05/2021
 Project Reports Due
WEEKLY GRADES AND STUDENT COMMENTS
★
★
★
★
★
It's actually good. 
★
★
★
★
★
Overall the class was good, the topics felt very clear. For the practice session, I also was having trouble understanding the block design in terms of commands. I didn't quite understand how Rstudio treats the design as a block without having to specify which variable is your blocking variable. 
★
★
★
★
★
Today's lecture was very helpful as always. The high level information combined with the explanation from the professor was helpful to understand the information, and the example using the MPG data helped me visualize the concept. Additionally, the deeper discussion regarding the project was extremely beneficial for me. 
★
★
★
★
★
well understood. 
★
★
★
★
★
Great explanation. 
★
★
★
★
★
The class was very interesting and practical experience is very good. 
★
★
★
★
★
I had trouble following along with today's lecture for some reason. I think maybe it was just the speed you went through the slides; it felt like a lot to take in. The RStudio session was also rushed, but that was understandable since class ran long with the questions. I'm still a little lost on how to do the last portion of the project, but I'm hopeful that we will go into more detail about it in the next lecture, or maybe it will just become clearer as we progress with lecture and the RStudio practice.
★
★
★
★
★
Clarifications of the doubts in the class are really helpful and the class is really interesting. 
★
★
★
★
★
Today's lecture was particularly helpful with the lengthy discussion of the figures for the next part of the project. Additionally, the theory for logistic regression was very helpful to understand the concept 
★
★
★
★
★
Things are coming together now. Looks like the 3rd part of the project will be both challenging and fun. 
★
★
★
★
★
Having difficulty doing the homework even after reviewing the lecture video a couple of times.
★
★
★
★
★
The class felt a little rushed, but maybe that's just due to it being short. I appreciated the time you took to go over the third and final part of the project. I hope that we can go into more depth as the class progresses; I feel like the second portion of the project wasn't covered enough. The example RStudio session was well done. Going in I wasn't sure what we were doing, and by the end I understood enough to do the in-class exercise with some difficulty.
★
★
★
★
★
no comments 
★
★
★
★
★
The whole lecture was very interesting and also easy to understand. 
★
★
★
★
★
It was good. 
★
★
★
★
★
Overall it was a good lecture, but it was somewhat hard to follow. I couldn't make some of the connections you were making and that may have been helped with examples or maybe not going quite as quickly through the material. 
★
★
★
★
★
Understood dummy variables. Need to review the second half of lecture again. Class exercise reinforced some of the concepts. 
★
★
★
★
★
The class was interesting, and having related assignments follow each class made me revise the previous class, which helped me get a clear idea of those topics.
★
★
★
★
★
Lecture was good and helpful. 
★
★
★
★
★
Everything's clear and easy to understand and implement.
★
★
★
★
★
This lecture helped me understand the underlying theory of a couple of different approaches to linear models, but I had trouble grasping the difference between linear regression and linear modeling in terms of theory. However, the distinction in how they apply to different factor levels was helpful.
★
★
★
★
★
Vitalii's demo was a little rushed, and it was a tad hard to relate it to the exercise.
★
★
★
★
★
I really like the way the class is going including the practical implementation of the topics covered and the time you give at the end of the lecture for the exercise try out. It seems really interesting. 
★
★
★
★
★
It was something new to me, and I think both professor and Vitalii did a great job on the topic! 
★
★
★
★
★
I'm very comfortable with the lecture and also the practical session provided by the TA. It is very clear and I could clear all my doubts during the session too.
★
★
★
★
★
I got a little lost at the end with the C(p) portion, but overall it was a great class. I appreciated the time spent talking about the second part of the project and it clarified a few things I was stuck on. 
★
★
★
★
★
Everything is clear and easy to get. 
★
★
★
★
★
Started understanding multivariate regression and ability to deduce response variable based on their combination. 
Comments from students
★
★
★
★
★
Thanks for the examples and the great explanation by Vitalii.
★
★
★
★
★
The lecture information was helpful and provided enough detail to help me understand the theory behind linear regression without becoming too complicated. The portion of class dedicated to discussion of the second project milestone was also highly appreciated. 
★
★
★
★
★
The class is pretty interesting with parallel practical work. 
★
★
★
★
★
This was probably the best class to date. I appreciated you going over the project first and taking the time to explain what the plots were. The practice was particularly helpful for the exercise today and sometimes it feels like this is a learn to use Rstudio course more than a statistics course. For example, with the project, which felt very much like a test of how well I could use Rstudio, but today's exercise didn't feel that way. I also appreciated the clear expectations for the second part of the project, so thank you. 
★
★
★
★
★
At the end of the class, I'm really happy as I have information about the ANOVA test and how to implement it on some data. For me, it's a valuable class. I appreciate it. 
Comments from students
★
★
★
★
★
Thank you for the extension and the hints provided during the class. Is there a chance we can get the answers/explanation for homework 1? 
★
★
★
★
★
Classes are so far so good. 
★
★
★
★
★
I had some confusion about the first project and what was expected from us. The class mostly cleared it up, but having the expectations explicitly outlined in the syllabus or on Blackboard would've been helpful, maybe for future classes or for the second assignment if it's not too much trouble. One other thing: in a previous lecture you stated that the third plot was the intersection of the first and second (the lecture on Feb. 05), which made the plot clarification given today confusing, since I don't believe that is actually the intersection (I could be mistaken). In any case, it would've been helpful to clarify with more time to fix the plot, again probably something for future classes. Overall the lecture today was good and I learned a lot. I really appreciated the detail you both went into during the RStudio portion of the class. Thank you!
★
★
★
★
★
I think I understood this one the best. Good class! 
Comments from students
★
★
★
★
★
Overall I think today was a great class. I was a little lost on the discussion about the project, and that has me worried that I am somehow behind. But overall I think the length of the class has been better the last two meetings, and I really appreciate the time in class to do the mini-assignment.
★
★
★
★
★
It'll be helpful if you could explain more about Gephi for the upcoming project.
★
★
★
★
★
Very helpful, as you're giving the homework related to the previous class.
★
★
★
★
★
The class is very interesting with a parallel practical and hands-on approach.
★
★
★
★
★
The pace of this lecture was good for me and the information provided for gephi and for R for the project tasks was very concise and helpful. 
Comments from students
★
★
★
★
★
Great lecture and tutorial. Would it be possible to see the answers for the exercises posted after the due date? I think it would help with the homework a lot. 
★
★
★
★
★
As I learn the theory of statistical methods and implement them in RStudio, I'm getting more confident using R programming.
★
★
★
★
★
Today's lecture was very helpful in providing a high-level view of common test statistics and how they are computed and applied. Additionally, the information about the project was helpful for me, as was the R tutorial.
★
★
★
★
★
This class was much simpler to follow than previous portions 
★
★
★
★
★
Good lecture. 
★
★
★
★
★
This class was particularly good. I liked that there was time at the end to work on the weekly assignment so we could ask questions if needed. 
Comments from students
★
★
★
★
★
Today's class was very useful. Dr. Pavlidis's lecture moves at a reasonable pace (for me) and tends to focus on important, high-level information without going into too much detail about the underlying statistics/mathematics. Additionally, the R tutorials continue to be helpful for me and I find they are paced well.
★
★
★
★
★
It's been good to learn more about the subject, as I get to learn new things. Explaining in more depth, with more examples, would make it easier to understand; other than that, everything's fine. Thank you.
★
★
★
★
★
Like the material being covered, the real-world data analysis project, and the practical help being provided by Vitalii! I am coming back to school after a while and am a bit rusty with math. Will covering the stats course at Khan Academy give me enough background to get a solid foundation for this course?
★
★
★
★
★
Informative! 
★
★
★
★
★
The class would be more interesting if the professor gave us more examples of the theoretical concepts, such as drawing things on the screen, to give us more clarity.
★
★
★
★
★
The class was nice and the professor explained everything very nicely. The TA also demonstrated the material nicely.
★
★
★
★
★
I would enjoy it if more in-person classes were organised, and parallel hands-on experience would work great in the meetings.
★
★
★
★
★
Please reduce the pace of your explanation and try to conduct at least one in-person class a month to revise all the topics covered so that we can have more interaction.
★
★
★
★
★
Much better today, but the practical portion was still far too quick. I couldn't keep up and it was hard to follow. Please try to slow down with the examples; I like to try to understand how the code works, and it's hard to do that when you skip ahead.
Comments from students
★
★
★
★
★
Kindly slow down the pace and use as many examples as possible to demonstrate in the practical part of the class.
★
★
★
★
★
The lab portion of the class was extremely rushed so it was hard to get anything useful from it. 
★
★
★
★
★
Please provide examples. 
★
★
★
★
★
The duration of the class is long; it becomes less interesting as time passes. It would be better if the class were one and a half hours each on two weekdays.
★
★
★
★
★
I like the way you teach in class. But the pace of the practical class is fast. 
★
★
★
★
★
Thank you for your lecture. 
★
★
★
★
★
The R programming part was very fast. It would be better to cover the basics of each part slowly than to go fast through the whole code. 
Comments from students
★
★
★
★
★
Class is helpful for both theory and code sections. Thank you! 
★
★
★
★
★
The exercises you worked through at times went too quickly for me, but outside of that the class was very useful. The class itself feels a bit long to be honest, it would've been nice to have two days at one and a half hours or two hours, three just seems like a lot for one time in my opinion, but that could just be me. I really did learn a lot and it was very interesting and I'm excited to learn more. 
★
★
★
★
★
Bit of audio problem, but otherwise great lecture. Liked the practical aspect and hands on coding. 
★
★
★
★
★
For the break, I would suggest objective time frames so we know how much time we have. Something like "we will take a break until [insert exact time]" would be helpful. Other than that, the lecture was very helpful 
★
★
★
★
★
The lecture was informative and a good review on some material I learned in previous courses. 
★
★
★
★
★
I've had no previous experience in R so this was a very good and useful introduction for me. 
★
★
★
★
★
First lecture was very clear and concise with its goals and content. The R introduction was also extremely helpful and structured very well. 
KEY INFORMATION
Class Meetings
Friday 4:00 – 7:00 pm @ Teams
Course Instructor
Prof. Ioannis Pavlidis
Office hours: 3-4 pm on Fridays (@ Teams)
Course TA
Mohammed Emtiaz Ahmed
Office hours: 12-2 pm on Thursdays (@ Teams)
Course Description - COSC 6323
The course covers statistical methods in human and technology studies or experiments, from where the bulk of scientific and engineering data originate. The course starts by situating statistics in the context of data science. Special emphasis is placed on the relationship of statistics to machine learning. Then, instruction proceeds in a stepwise manner building the student’s background in the statistical tools of the trade, without which an MS thesis or PhD dissertation cannot be complete. The course culminates with sessions on experimental design, one of the cornerstones of modern data science.
Although the introduction and methodological sections of scientific papers differ from discipline to discipline (e.g., algorithms vs. assays), the results sections of papers should conform to a universal pattern, according to currently accepted best practices. The produced data should be derived according to appropriate study/experimental designs and should be subjected to relevant statistical tests. There is no such thing as statistics for computer scientists or statistics for biologists; statistics is the same for everybody. However, certain disciplines tend to use some tools more than others, and instruction needs to be tailored according to students’ educational backgrounds. In computer science in particular, the adoption of statistical analysis of experimental results has been slow. This has changed in the last few years, and several computing disciplines have already adopted statistical methods as the analytic standard, while others are bound to follow sooner or later. Among the computer science communities that are at the forefront of this movement are the Human-Computer Interaction and Computer Vision communities. The Statistical Methods course aims to cover this need and is paced taking into account the typical background of graduate students in computer science. It is very practical in its orientation (no proofs), emphasizing the understanding of concepts and the ability to choose the right design or apply the right test.
The first part of the course starts with the delineation between continuous and discrete variables and the enormous implication that this carries for the selection of tests. Then, it proceeds with the description of distributions, probabilities, and error types that are fundamental to the construction of the t-tests, ANOVA tests, and nonparametric tests. Next, the course visits regression in its various forms, completing the coverage of significance and association tests used in almost all scientific papers. Emphasis is placed on multiple regression and linear modeling, a powerful and elegant method to examine the effect of multiple factors in a research problem; it is heavily used nowadays in MS and PhD research. The treatment of symbolic data and the tools of last resort, that is, nonparametric methods, complete the course’s first part.
The course’s second part covers various experimental designs. Before students start analyzing data, they need to know according to which principle to collect these data in order to address their hypothesis; for this, they need to pick the right experimental design. Even perfect analysis will not save the day if the investigator picked the wrong experimental design (i.e., garbage in, garbage out). Hence, at the end of the course’s second part, students acquire a 30,000-foot view of the scientific and engineering process, solidifying their ability to design, collect, and test data.
The course has four homework assignments to reinforce the understanding of the concepts and methods. In place of a final exam, the course has a semester-long project, where a problem is defined for the class, and then each group of students is required to come up with a study design, collect/quality control data, and perform tests, putting everything in the form of a term paper. The homeworks are individual assignments while the project is a group assignment; each project group typically consists of 2-3 students.
The students need to know R and RStudio in order to process and plot the data. R is becoming one of the most useful tools for computer scientists in the data analytics business. The instructors provide the students with online educational material and organize an R tutorial class. Importantly, the last hour of each three-hour class session is devoted to R programming, where the students code the theoretical principles covered earlier in the session.
Gradebook
10% Participation
50% (4 x 12.5%) Homework
40% Project
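As a rough illustration of how the Gradebook weights above combine, the sketch below computes a weighted course score. It is written in Python only for the sake of the sketch (the course itself uses R). The function name `course_score`, the 0-100 scale for each component, and the sample scores are assumptions made for illustration and do not come from the syllabus.

```python
# Weights taken from the Gradebook: 10% participation,
# 4 homeworks at 12.5% each, 40% project.
PARTICIPATION_WEIGHT = 0.10
HOMEWORK_WEIGHT_EACH = 0.125
PROJECT_WEIGHT = 0.40


def course_score(participation, homeworks, project):
    """Hypothetical helper: combine component scores (assumed 0-100) into one score."""
    if len(homeworks) != 4:
        raise ValueError("expected exactly four homework scores")
    homework_part = sum(HOMEWORK_WEIGHT_EACH * h for h in homeworks)
    return (PARTICIPATION_WEIGHT * participation
            + homework_part
            + PROJECT_WEIGHT * project)


# Made-up sample scores, for illustration only.
score = course_score(90, [80, 85, 70, 95], 88)
print(f"{score:.2f}")  # 85.45
```

Because the three weights sum to 100%, a student who scores 100 on every component ends up with a course score of 100 under this reading of the Gradebook.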
COURSE OUTLINE
Lesson 1: Data, Statistics, and Data Science 1/17/2020
Situating Statistics and Machine Learning in Data Science; observations and variables; types of measurements for variables; distributions; numerical descriptive statistics; exploratory data analysis; bivariate data; data collection
Lesson 2: Probabilities and Sampling Distributions 1/24/2020
Probability; discrete probability distributions; continuous probability distributions; sampling distributions
Homework #1 Out
Lesson 3: Principles of Inference 1/31/2020
Hypothesis testing; estimation; sample size; assumptions
Assignment of Projects
Lesson 4: Inferences on a Single Population 2/7/2020
Inferences on the population mean; inferences on a proportion; inferences on the variance of one population; assumptions
Homework #1 Due on 2/7/2020
Homework #2 Out on 2/7/2020
Lesson 5: Inferences for Two Populations 2/14/2020
Inferences on the difference between means using independent samples; inferences on variances; inferences on means for dependent samples; inferences on proportions; assumptions
Lesson 6: Inferences for Two or More Means 2/21/2020
Analysis of variance; linear model; assumptions; specific comparisons; random models; unequal sample sizes; analysis of means
Homework #2 Due on 2/21/2020
Homework #3 Out on 2/21/2020
Lesson 7: Linear Regression 3/6/2020
The regression model; estimation of parameters; inferences for regression; correlation; regression diagnostics
Lesson 8: Multiple Regression 3/27/2020
The multiple regression model; estimation of coefficients; inferential procedures; correlations; special models; multicollinearity; variable selection; detection of outliers
Lesson 9: Linear Models 4/3/2020
The dummy variable model; unbalanced data; models with dummy and interval variables; weighted least squares; correlated errors
Homework #3 Due on 4/3/2020
Homework #4 Out on 4/3/2020
Lesson 10: Categorical Data 4/10/2020
Hypothesis test for a multinomial population; goodness of fit; contingency tables; log-linear model
Lesson 11: Nonparametric Methods 4/17/2020
One sample; two independent samples; more than two samples; rank correlation; the bootstrap
Lesson 12: Experimental Designs 4/24/2020
Randomized designs; paired comparison designs; randomized complete block designs; Latin square designs; Greco-Latin square designs; balanced incomplete block designs; two-factor factorial designs; general factorial designs
Homework #4 Due on 4/27/2020
Project Reports Due on 4/29/2020
References
[1] Horton, N. J. and Kleinman, K. Using R and RStudio for Data Management, Statistical Analysis, and Graphics. CRC Press, 2015.
[2] Freund, R. J., Wilson, W. J., and Mohr, D. L. Statistical Methods. 2010.
[3] Montgomery, D. C. Design and Analysis of Experiments. Ninth Edition. John Wiley & Sons, 2017.