Sunday, March 31, 2024

Make Everyone Smile

Hey there! Just wanted to let you know that today is officially National 'Make Everyone Smile' Day! So, consider yourself officially tasked with spreading joy and laughter wherever you go. Oh, and if you happen to fall for any pranks along the way, just remember it's all in good fun! Happy April Fool's Day!

Saturday, March 5, 2022

New MS project ideas

 

  1. The car tells you what it needs: http://news.mit.edu/2017/software-let-your-car-tell-you-what-it-needs-1026
  2. Similarly, in an industrial setting, an expensive machine tells you when it needs service
  3. Wristbands detect stress signals: http://news.mit.edu/2016/empatica-wristband-detects-alerts-seizures-monitors-stress-0309
  4. www.tanium.com endpoint management
  5. Build a tool like AHA Ideas: https://www.aha.io/pricing?product=Ideas
  6. Develop a data pipeline for Climate Intelligence (get inspiration from https://cervest.earth/)
  7. Morale and culture-building app for remote teams (get inspired by https://culture.justdisco.com/)
  8. Cross-selling services for a specific business (get inspired by https://swogo.com/). For example, millions of sellers on eBay or Amazon sell just 1 or 2 items; the service will enable them to cross-sell other products to the same customers and save customer acquisition costs
  9. Subscription business creation platforms are emerging; see this one for gym/athletic trainers: https://playbookapp.io/

Wednesday, February 10, 2021

Spring 2021 - MS CMPE272 Team Project Ideas

While COVID taught us a great deal about resilience, patience, empathy and generosity, it also gave us an opportunity to think hard and introspect. What is important in our lives? How do we use our technical skills and everything else to solve problems for the greater good?

Data sources: https://usafacts.org/data/

My Grad students always do projects using cool modern technology such as AI and Machine Learning. Past projects are available here:

Spring 2017: https://github.com/SJSU272LabS17

Fall 2017: https://github.com/SJSU272LabF17

Spring 2018: https://github.com/SJSU272LabSP18

Fall 2018: https://github.com/SJSU272LabF18

Spring 2019: https://github.com/SJSU272Spring2019

Fall 2019: https://github.com/SJSUFall2019-CMPE272

Spring 2020: https://github.com/SJSUSpring2020-cmpe272


This semester, they will do even better. Here are some ideas I am sharing with them:

Category: AI in Software Development

Idea#1: Apply ML to fuzzing to improve software reliability

Use ML to avoid testing every possible input: train on "interesting inputs" so that the generated test cases are more likely to produce suspect outputs. Read more about Fuzzing here
This area of research is also called "Big Code"; see the DARPA announcement.
JSNice is another good implementation of statistical renaming, type inference and deobfuscation techniques for JavaScript code.
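
To make the notion of "interesting inputs" concrete, here is a minimal sketch of a feedback-driven mutation fuzzer. It is not an ML system: it simply keeps any input that reaches a code path not seen before, which is the signal a learned model could later be trained on (for example, to choose which input to mutate next). The `target` function and its path labels are hypothetical stand-ins for a real program under test.

```python
import random

def target(data: bytes) -> str:
    """Hypothetical program under test: returns a label for the code path taken."""
    if len(data) < 2:
        return "too-short"
    if data[0] == ord("{"):
        if data[-1] == ord("}"):
            return "balanced-braces"
        return "open-brace"
    if data.isdigit():
        return "number"
    return "other"

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random byte-level mutation: replace, insert, or delete.
    Replace and delete are skipped when the input is empty."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 and buf:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    elif choice == 1:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif buf:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)

def fuzz(seed: bytes, rounds: int = 2000, rng_seed: int = 0) -> dict:
    """Keep only the inputs that reach a code path not seen before."""
    rng = random.Random(rng_seed)
    interesting = {target(seed): seed}  # path label -> first input that reached it
    for _ in range(rounds):
        # Assumption: parents are picked uniformly; an ML model could rank them instead
        parent = rng.choice(list(interesting.values()))
        child = mutate(parent, rng)
        label = target(child)
        if label not in interesting:
            interesting[label] = child
    return interesting

corpus = fuzz(b"12")
for label, example in corpus.items():
    print(label, example)
```

The design choice worth noting is that the corpus only grows when behavior changes, so later mutations start from inputs that already reached deeper paths rather than from scratch.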

Idea#2: New implementation of probabilistic programming

A probabilistic programming language (PPL) is a tool for statistical modeling. The idea is to borrow lessons from the world of programming languages and apply them to the problems of designing and using statistical models. Learn more here. Also see a WebPPL interactive implementation here
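
As a rough illustration of what a PPL does under the hood, the sketch below performs exact inference by enumeration over a tiny model written in plain Python. The function name `enumerate_posterior` and the two-coin model are illustrative assumptions, not part of WebPPL or any other library; a real PPL adds richer distributions and approximate inference on top of the same idea.

```python
from fractions import Fraction
from itertools import product

def enumerate_posterior(variables, condition, query):
    """Exact inference by enumeration over independent discrete variables.

    variables: dict of name -> dict of value -> probability
    condition: function(assignment) -> bool, the observed evidence
    query:     function(assignment) -> hashable, the quantity of interest
    Assumes the evidence has non-zero probability.
    """
    names = list(variables)
    posterior = {}
    evidence = Fraction(0)
    for values in product(*(variables[n] for n in names)):
        assignment = dict(zip(names, values))
        weight = Fraction(1)
        for n in names:
            weight *= variables[n][assignment[n]]
        if condition(assignment):
            evidence += weight
            key = query(assignment)
            posterior[key] = posterior.get(key, Fraction(0)) + weight
    # Normalize by the probability of the evidence
    return {k: w / evidence for k, w in posterior.items()}

# Model: two fair coins. Evidence: at least one came up heads.
coin = {True: Fraction(1, 2), False: Fraction(1, 2)}
model = {"a": coin, "b": coin}
result = enumerate_posterior(
    model,
    condition=lambda s: s["a"] or s["b"],
    query=lambda s: s["a"] and s["b"],
)
print(result[True])  # probability that both are heads given the evidence: 1/3
```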

Idea#3: Use AI to measure and improve Microservice performance

Microservices architecture is a common way to deploy containerized software in the cloud. However, an increasing number of microservices increases management complexity because the architecture is distributed. Network latency and load balancing are other challenges. See this use case for inspiration.

Category: Data Privacy & Governance

Idea#4: Data Privacy in the Decentralized AI architecture

Whether it's a race to get a COVID vaccine developed or a race to open businesses safely, it's all about data sharing between government and private entities without compromising the privacy of our citizens. Technologies such as secure multi-party computation (sMPC), differential privacy and other techniques enable privacy-preserving data sharing. Take the example of various shopping malls in a city sharing data without disclosing any competitive information. A centralized analysis of federated data will allow decision makers to take safety measures while still facilitating the shopping experience.
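
The shopping mall example can be sketched with additive secret sharing, one of the simplest building blocks behind sMPC. This is a single-process simulation under stated assumptions: the parties are honest, the values are non-negative integers well below the modulus, and the names `make_shares` and `secure_sum` are hypothetical helpers rather than any library's API.

```python
import secrets

PRIME = 2**61 - 1  # modulus for the shares; assumed far larger than any real total

def make_shares(value: int, n_parties: int) -> list:
    """Split value into n random shares that sum to value modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def secure_sum(private_values: list) -> int:
    """Compute the total without any party revealing its own value."""
    n = len(private_values)
    # Row i holds the shares party i hands out; column j is what party j receives
    distributed = [make_shares(v, n) for v in private_values]
    # Each party publishes only the sum of the shares it received
    partial_sums = [sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)]
    return sum(partial_sums) % PRIME

mall_visitors = [120, 340, 95]  # each mall's private daily count (made-up numbers)
total = secure_sum(mall_visitors)
print(total)  # 555
```

Each individual share is a uniformly random number, so a mall learns nothing about a competitor's count from the shares it receives; only the combined total is revealed.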

Idea#5: Create domain specific business glossary using NLP

Build vertical taxonomies, such as an e-commerce taxonomy for retailers or a telecom taxonomy for telecom companies. Businesses will be able to bring their domain-specific private documents, and the NLP-based AI system will create well-formed taxonomies that can be ingested into a common catalog such as Google Data Catalog or Alation Business Glossary.

Category: Social Good

Idea#6: Lucy keeps our seniors happy and engaged

The #1 problem our seniors (over 80) face today is loneliness and isolation. According to the U.S. Census Bureau, 11 million people aged 65 and older, or 28%, lived alone at the time of the census. As people get older, their likelihood of living alone only increases. Additionally, more and more older adults do not have children, reports the AARP, and that means fewer family members to provide company and care as those adults become seniors.
1. Senior isolation increases the risk of mortality.
2. Feelings of loneliness can negatively affect both physical and mental health.
3. Perceived loneliness contributes to cognitive decline and risk of dementia.
4. Social isolation makes seniors more vulnerable to elder abuse.
5. Loneliness in seniors is a major risk factor for depression.
6. Socially isolated seniors are more pessimistic about the future.

Lucy is a skill developed for the Amazon Alexa Echo Dot device, which sits in the senior's room.

A group of student volunteers writes personal emails to our seniors. Lucy reads those emails to the seniors as they arrive or at a set time.


Idea#7: Get inspiration from DataKind Projects


https://www.datakind.org/projects


Idea#8: Using diversitydatakids datasets, create a dashboard and use NLG to generate a citizen-friendly report


Datasets: https://data.diversitydatakids.org/dataset


NLG: https://rosaenlg.org/rosaenlg/1.10.1/index.html


Idea#9: Street vendors support network

Bringing street vendors into an organized group has several advantages:

Vendors will get much-needed financial help and support to bootstrap their business.
The government will have much-desired success in getting them onto e-payments and the cashless economy.
This network will be able to uplift them and help them move into a high-value quadrant. For example, a street food hawker graduates to become a food truck owner. A vegetable seller eventually becomes a “fresh-cut veggies” supplier to apartments.
This network can be connected to micro lenders and other entrepreneurs such as organic farmers.





Category: Cyber Security

Idea#10: Zero Trust security 
  • Zero trust vs “trust and verify”
  • Hyperlocalization in business will drive next gen endpoint security for data and application access
  • Borderless security practices for multi-national companies and workforce
  • Remote Work-employee experience

Category: Immigration

Idea#11: Look at the questions to ask and the available data sets here (https://usafacts.org/issues/immigration/) and create a dashboard for legal immigrants that can help answer questions such as:
What are the chances of an H1B visa being denied?
What will be the impact on a certain industry if the H1B visa is eliminated?
What will be the benefit to the US economy if the US implements time-bound citizenship for immigrant workers?

Category: Crime & Justice

Idea#12: Look at available data sets and create a dashboard for US taxpayers that can answer important questions such as:
  • What is the burden on taxpayers for prisoners in private prisons vs state prisons?
  • What is the impact of pending court cases on the economy and taxpayers? Is there a correlation with specific types of crime?
  • Is there a correlation between certain types of firearms and crime?

Idea#13: Justice on wheels

Access to legal services in India is very limited, especially in rural areas. In many cases, a long-running justice process and expensive legal help deter people from even seeking justice. A "Justice on wheels" app (with a network of lawyers) provides mobile, at-your-doorstep service for:

Affidavit and stamp paper related work
Wills and trusts
Legal rights education
Consumer protection related services and others

This “On wheels” idea can also be applied to tutoring and to selling books and stationery.

“Accounting on wheels” can be another opportunity, as there are tons of B.Com graduates in India looking for work. The company will train them in business and financial accounting.


More coming..






Monday, December 14, 2020

Story of a food

Food is what brings us together. If there is one thing this pandemic has taught us, it is how important it is to cook at home with our family, eat healthy and nutritious food, and exercise to boost immunity and stay sane during this time. I love cooking at home, and I am a big fan of cooking with healthy grains and vegetables, even if they come from unknown territories. Recently I started a blog to share the untold stories of some of the superfoods that I think folks don't know much about. In this series, I have published stories of Ginger, Beetroots and Quinoa. Please subscribe to the Substack blog so you get weekly stories directly in your inbox. Let me know your thoughts and any recipes that you want to share.

https://foodstory.substack.com/




Wednesday, October 28, 2020

ML 101 - Setup Jupyter notebook on your laptop

These instructions are for Mac; please look for similar instructions to install on Windows/Linux.

First things first: install brew if you don't have it:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"


Setup PATH:

export PATH="/usr/local/opt/python/libexec/bin:$PATH"


Install latest Python3 and necessary packages:


brew install python

pip3 install -U pip setuptools wheel


Install Jupyter:


pip3 install -U jupyter jupyter-client


Install data science packages:


pip3 install -U numpy pandas scikit-learn matplotlib


Set up a working directory and start the Jupyter server:


mkdir ~/work

jupyter-notebook --notebook-dir=~/work


This should start Jupyter on your localhost: http://localhost:8888/
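
Once the notebook opens, a quick sanity check in the first cell can confirm that the packages installed above are visible to the notebook's Python. This is a small sketch; the helper name `check_packages` is my own, and it only checks that each module can be found, not that it works end to end.

```python
import importlib.util

def check_packages(names):
    """Return {module_name: True/False} depending on whether the module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Import names for the packages installed above (scikit-learn imports as "sklearn")
status = check_packages(["numpy", "pandas", "sklearn", "matplotlib"])
for name, found in status.items():
    print(f"{name}: {'OK' if found else 'MISSING'}")
```

If something shows up as MISSING, the notebook is probably running a different Python than the one pip3 installed into, which usually points back to the PATH step.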


Tuesday, March 24, 2020

CMPE272 Spring 2020 Project Ideas

During this unprecedented time, while we are all sheltered in place due to COVID-19, it's in our best interest to keep ourselves busy learning new technology. Students can use IBM Watson Studio in the IBM Cloud for free: https://cloud.ibm.com/catalog/services/watson-studio
They can also use Analytics Engine (Spark as a Service): https://cloud.ibm.com/catalog/services/analytics-engine

Using these services, they can solve real world problems (examples below):

1. Predict COVID-19 spread using Kaggle data: https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge
2. Use NHTSA data to let users query fatalities by keyword: http://www.nhtsa.gov/FARS
3. Use air quality data (http://www3.epa.gov/airdata/ad_data.html) to send alerts based on the user's zip code.
4. Earthquake detection using USGS data: http://earthquake.usgs.gov/earthquakes/search/
5. Predict employee retention (you can use Glassdoor data for this prototype).
6. Fraud detection in user reviews (use this data: https://snap.stanford.edu/data/web-Amazon.html)
8. Use FAA bird strike data to predict the most dangerous airports: http://wildlife.faa.gov/database.asp
9. Use home energy usage datasets (http://redd.csail.mit.edu/) as a training set to find out how your own home energy usage measures up. Hint: the user enters their own energy usage and the app provides a comparative analysis.
10. Use baby names datasets to come up with some cool analytics: https://www.ssa.gov/oact/babynames/limits.html
11. Use NIST vulnerability datasets to predict possible DDoS attacks: https://nvd.nist.gov/download.cfm
12. Use water and sanitation data (http://www.data.unicef.org/water-sanitation/sanitation.html) to produce analytics such as which countries are worst and which are improving.
13. Use child mortality data (http://www.unicef.org/statistics/index_countrystats.html) to figure out the most important causes of death for children under 5.
14. Use 2012 presidential donation datasets to find out who donated the most to presidential candidates and whether there is any correlation: http://www.fec.gov/finance/2012matching/2012matching.shtml
15. Use music reviews data (https://nijianmo.github.io/amazon/index.html; contact the owner for the data link) to predict popular songs.
17. Use interesting genome and protein datasets from http://www.ncbi.nlm.nih.gov/home/download.shtml and use Spark to calculate interesting facts. (Hint: use ClinVar datasets to find all gene types related to the condition "Breast-Ovarian cancer".)
18. Use housing affordability datasets (http://catalog.data.gov/dataset/housing-affordability-data-system-hads) with Spark to come up with good analytics, such as which city and zip code has good overall job opportunity and housing affordability for 30-40 year olds.
19. Use farmers market location datasets (http://catalog.data.gov/dataset/farmers-markets-geographic-data) and Spark to generate some interesting analytics.
20. Use government real estate asset datasets (http://catalog.data.gov/dataset/real-estate-across-the-united-states-rexus-inventory-building) and Spark to come up with some cool analytics, such as how much money the government is spending on maintaining useless assets.
21. Use cause-of-death datasets (http://catalog.data.gov/dataset/leading-causes-of-death-by-zip-code-1999-2013) and Spark to answer health-related questions.
23. Use NY open data of your choice (https://nycopendata.socrata.com) and bring some good analytics.
24. Use Austin restaurant inspection report data (https://data.austintexas.gov/dataset/Restaurant-Inspection-Scores/ecmv-9xxi) and Spark to answer critical consumer questions, such as which cuisine or which area's restaurants had more violations in the past year.

Sunday, May 20, 2018

Spring 2018 Technology Showcase at San Jose State University

Every semester, hundreds of students graduate in Computer Engineering from the SJSU Charles Davidson College of Engineering. Visit my LinkedIn post for this semester's 295B projects:
https://www.linkedin.com/pulse/technology-showcase-day-rakesh-ranjan/ 
