Interest in Data Science |
schooter
Organizer Joined: Feb/22/2010 Status: Offline Points: 224 |
Posted: Apr/28/2016 at 8:37pm |
Hi Everyone,
I've heard some potential interest, or at least curiosity, about "Big Data" and how SAE Baja could be used to create opportunities in data science. This is a focus area that many industries are gathering around, so the ability to work with large data sets, create value stories, and direct action is becoming a highly valued skill. I'm thinking of making timing data available for students to perform analytics on, and/or having my employer sponsor an award for the team that best utilizes data. Or even simply having an interactive data visualization embedded on a website for students to play with. I'd like to gather some thoughts on the level of student interest. What are your thoughts? Ideas? Please speak up!
|
||
Chase Schuette
https://www.linkedin.com/in/chaseschuette/ |
||
Brad SXT
Welding Master Joined: Dec/02/2012 Location: The Motor City Status: Offline Points: 180 |
I think this is a good idea, since my employer is hiring quite a few people in this field.
It's clear to me how this would benefit individual team members by making them more marketable to employers. That being said, I'm not familiar enough with the "big data" scene, and in the five minutes or so I've thought about it I don't have the slightest clue how it would benefit teams during the actual competition. Do you have any ideas on what teams could accomplish with this data to make them more competitive?
||
UofL Baja Alumni
Midnight Mayhem track design Sometimes Livestream Announcer Guy Design judge every once in awhile |
||
Bettner12
Welding Master Joined: Jan/24/2011 Location: Hanover, Pa Status: Offline Points: 239 |
I think many teams would appreciate the ability to look back on their lap times, the length of refueling or pit stops, and the time it took to be towed back. The ability to statistically analyze the data, and possibly draw correlations between results in certain dynamic events and lap times in the endurance race, might be useful as well. I know more and more teams are using data acquisition systems, and there might be additional information to be gained from correlating some of the data collected there with the timing and scoring from the race. There are probably teams that would love to have the raw event times available for analysis after the competitions, but collecting that would be too big an undertaking for one team. Since we already have all that info from the transponders, it seems like a waste not to use it to influence decisions for the next year. The more we are able to quantify our designs and design changes, the better the engineering SOME of the teams could do each year. I will never say no to more data. -Clinton
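For anyone curious what that kind of analysis could look like, here is a minimal Python sketch. It assumes the timing data could be exported as a list of lap times per car; the car numbers, event times, and lap times below are made up purely for illustration, and `pearson` and `flag_long_laps` are hypothetical helper names, not part of any scoring system. It correlates an acceleration event time with each car's median endurance lap and flags unusually long laps, which may be pit stops, refueling, or tows.

```python
from statistics import mean, median


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (var_x * var_y) ** 0.5


def flag_long_laps(lap_times, factor=1.5):
    """Return indices of laps longer than factor * median lap time."""
    typical = median(lap_times)
    return [i for i, t in enumerate(lap_times) if t > factor * typical]


# Hypothetical data: car number -> acceleration event time (s)
accel_time = {12: 4.1, 27: 4.5, 33: 4.9, 48: 5.3}
# Hypothetical data: car number -> endurance lap times (s)
endurance_laps = {
    12: [182, 185, 181, 420, 183],
    27: [190, 188, 193, 191, 505],
    33: [201, 199, 640, 203, 200],
    48: [215, 212, 214, 216, 213],
}

cars = sorted(accel_time)
# The median is used so a single long pit lap does not skew a car's typical pace.
typical_lap = [median(endurance_laps[c]) for c in cars]
r = pearson([accel_time[c] for c in cars], typical_lap)
print(f"accel time vs. median lap: r = {r:.2f}")
for c in cars:
    print(c, "long laps at indices", flag_long_laps(endurance_laps[c]))
```

The 1.5x median threshold is an arbitrary starting point; a real course and real pit procedures would probably call for a different cutoff.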
|
||
Penn College of Technology Alumni
Maryland 2018 Organizer |
||
schooter
Organizer Joined: Feb/22/2010 Status: Offline Points: 224 |
Thanks for the replies.
Brad SXT, I certainly do have ideas about what teams could accomplish, but I want to leave it open to participants to decide what they will do with the data. How the data is analyzed is where the critical thinking and analytical skills of each team would come into play. Bettner12's ideas are similar to what I'm thinking, though. I'm looking at potential use for making real-time decisions during competition (particularly the enduro) and also for future design reference. As I was saying, making the data available is one idea. More simply, providing an interactive or even static visualization embedded on a website is another. This is all very preliminary; I am just receiving some significant interest from people who control some budgets at my employer.
|
||
Chase Schuette
https://www.linkedin.com/in/chaseschuette/ |
||
schooter
Organizer Joined: Feb/22/2010 Status: Offline Points: 224 |
The sheer size of a data set is a common challenge in this kind of work. I would encourage you to keep thinking about this. Check out Tableau software and download the free trial if you're interested: http://www.tableau.com/products/trial
Edited by schooter - May/02/2016 at 11:04pm |
||
Chase Schuette
https://www.linkedin.com/in/chaseschuette/ |
||
Bettner12
Welding Master Joined: Jan/24/2011 Location: Hanover, Pa Status: Offline Points: 239 |
I didn't mean the actual handling of the data set, just the challenge of a single team timing a couple thousand laps. My experience so far has been with Minitab, but Tableau looks a little "newer" and makes it easier to generate useful graphics. Just from looking at the live scoring data from Tennessee after the race, I've already had some pretty significant realizations about some events and what their total impact on lap times ends up being.
|
||
Penn College of Technology Alumni
Maryland 2018 Organizer |
||
schooter
Organizer Joined: Feb/22/2010 Status: Offline Points: 224 |
I agree; a single team timing a couple thousand laps would be unreasonable. If a data set were made available to students, it would include the timing data. Tableau is analytics software that can handle large data sets. For instance, I'm working with a data set right now that has over 500,000 rows and a few dozen columns. I've even seen a data set with over 300 million rows, and Tableau handled it very reasonably. Tableau is a "BIG Data" tool. It's been a long time since I've played with Minitab. I don't believe it can work with large data sets, though I would need to confirm that.
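Outside of a tool like Tableau, a large timing file can also be summarized by streaming through it one row at a time instead of loading it all at once. This is only a rough sketch in Python: the column names `car` and `lap_seconds`, the sample rows, and the `summarize_laps` helper are all assumptions made for illustration, not the format of any real export.

```python
import csv
import io
from collections import defaultdict


def summarize_laps(rows):
    """Stream over timing rows and keep only running totals per car.

    `rows` is any iterable of dicts with the hypothetical keys 'car' and
    'lap_seconds'. Memory use grows with the number of cars, not the
    number of rows.
    """
    stats = defaultdict(lambda: {"laps": 0, "total": 0.0, "best": float("inf")})
    for row in rows:
        t = float(row["lap_seconds"])
        s = stats[row["car"]]
        s["laps"] += 1
        s["total"] += t
        s["best"] = min(s["best"], t)
    return {
        car: {"laps": s["laps"], "mean": s["total"] / s["laps"], "best": s["best"]}
        for car, s in stats.items()
    }


# Tiny in-memory stand-in for a large export (made-up values). A real file
# would be opened with open(path, newline="") and read the same way.
sample = io.StringIO(
    "car,lap_seconds\n12,182.0\n27,190.5\n12,185.0\n27,188.5\n12,181.0\n"
)
summary = summarize_laps(csv.DictReader(sample))
for car, s in sorted(summary.items()):
    print(car, s)
```

Because `csv.DictReader` yields rows lazily, the same function should work on a file with millions of rows, though it will be slower than a dedicated analytics tool.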
Edited by schooter - May/03/2016 at 10:18am |
||
Chase Schuette
https://www.linkedin.com/in/chaseschuette/ |
||
rushvr
Double Secret Probation Joined: May/04/2016 Status: Offline Points: 2 |
Data science employs techniques and theories drawn from many fields within the broad areas of mathematics, statistics, information science, and computer science, including signal processing, probability models, machine learning, statistical learning, data mining, database, data engineering, pattern recognition and learning, visualization, predictive analytics, uncertainty modeling, data warehousing, data compression, computer programming, artificial intelligence, and high performance computing. Methods that scale to big data are of particular interest in data science, although the discipline is not generally considered to be restricted to such big data, and big data solutions are often focused on organizing and preprocessing the data instead of analysis. The development of machine learning has enhanced the growth and importance of data science. http://www.fathersday-2014.net
Edited by rushvr - May/04/2016 at 1:53am |
||
schooter
Organizer Joined: Feb/22/2010 Status: Offline Points: 224 |
SPAM... and it hits about every key phrase except IoT (Internet of Things) and a few other buzzwords.
||
Chase Schuette
https://www.linkedin.com/in/chaseschuette/ |
||
schooter
Organizer Joined: Feb/22/2010 Status: Offline Points: 224 |
I'm in talks about a possible hackathon for next season. Thoughts?
|
||