Right now, machine learning is a big thing (from Google's self-driving car to disease prediction!). In this field, data is everything. If you don't have good-quality data, your accuracy is going to be a mess, so it's customary to preprocess your data. But there are so many preprocessing techniques. Which ones should you use? There are already a lot of tutorials online, but I'd like to share what I've been doing in my current project.
Preprocessing Techniques
Figure: original and downsized (top) vs. cropped and downsized (bottom)
It all depends on your data. The data I am currently working with are images from multiple databases. Imagine the heterogeneity of the data! So here are some preprocessing methods that I use:
- Normalization > I scale the pixel values to a range of 0 to 1. I do this to keep the values from blowing up during computation.
- Downsizing the image > Large images take too long to load and process, so downsizing them speeds up the computation.
- Cropping the image > Sometimes there are unnecessary portions in an image, so I crop to the area that I want to focus on.
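The three steps above can be sketched in a few lines. This is only a minimal illustration using NumPy, not the script used for the images in this post: the function names, the 400x400 crop, the downsizing factor of 4, and the random stand-in image are all assumptions made for the example.

```python
import numpy as np

def normalize(image):
    """Scale 8-bit pixel values from [0, 255] to [0.0, 1.0]."""
    return image.astype(np.float32) / 255.0

def center_crop(image, crop_h, crop_w):
    """Keep a crop_h x crop_w window around the image centre."""
    h, w = image.shape[:2]
    top = (h - crop_h) // 2
    left = (w - crop_w) // 2
    return image[top:top + crop_h, left:left + crop_w]

def downsize(image, factor):
    """Shrink by an integer factor by averaging factor x factor blocks."""
    h, w = image.shape[:2]
    h, w = h - h % factor, w - w % factor  # trim so the blocks divide evenly
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor, -1)
    return blocks.mean(axis=(1, 3))

# Hypothetical 8-bit RGB image standing in for a real photo.
image = np.random.default_rng(0).integers(0, 256, size=(480, 640, 3), dtype=np.uint8)

# Normalize, crop the centre, then shrink each side by a factor of 4.
processed = downsize(center_crop(normalize(image), 400, 400), 4)
print(processed.shape)
```

Cropping before downsizing means the averaging only runs over the region you actually keep. In a real project, the resize functions in OpenCV, Pillow, or scikit-image would be the more usual choice for this step.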
Data Augmentation
Another thing in machine learning: you need a lot of data. The bigger the better, because the data is what the model learns from. One technique I use to increase the amount of data is data augmentation. For images, I do the following:
- Random adjustment of the brightness
- Random adjustment of the contrast
- Random rotation
- Random flipping
Here are sample augmentations featuring a sleeping Loki.
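For reference, the four augmentations listed above could be sketched roughly as follows. This is an illustrative NumPy version, not the script that produced the Loki samples: the parameter ranges are made up, images are assumed to be float arrays in the range 0 to 1, and rotation is limited to multiples of 90 degrees to keep the example dependency-free (a library such as Pillow or OpenCV would handle arbitrary angles).

```python
import numpy as np

def random_brightness(image, rng, max_delta=0.2):
    """Add a random offset to every pixel, then clip back to [0, 1]."""
    delta = rng.uniform(-max_delta, max_delta)
    return np.clip(image + delta, 0.0, 1.0)

def random_contrast(image, rng, lower=0.7, upper=1.3):
    """Stretch or squeeze pixel values around the image mean."""
    factor = rng.uniform(lower, upper)
    mean = image.mean()
    return np.clip((image - mean) * factor + mean, 0.0, 1.0)

def random_rotation(image, rng):
    """Rotate by a random multiple of 90 degrees (0, 90, 180 or 270)."""
    k = int(rng.integers(0, 4))
    return np.rot90(image, k)

def random_flip(image, rng):
    """Mirror the image left to right half of the time."""
    if rng.random() < 0.5:
        image = image[:, ::-1]
    return image

def augment(image, rng):
    """Apply all four random augmentations in sequence."""
    for step in (random_brightness, random_contrast, random_rotation, random_flip):
        image = step(image, rng)
    return image

rng = np.random.default_rng(42)
image = rng.random((64, 64, 3))  # hypothetical image with values in [0, 1]
samples = [augment(image, rng) for _ in range(5)]
print(len(samples), samples[0].shape)
```

Each call to `augment` draws fresh random parameters, so one source image yields a different variant every time.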
Python Libraries
I wrote my own scripts to produce the images above, but here are some libraries that you can use:
- OpenCV
- Pillow
- Scikit-image
This pipeline produces a total of 1000 images which are cropped at the center,
rotated and zoomed according to a certain probability.
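A pipeline of this kind could look roughly like the sketch below. It is a guess at the general shape only, written with NumPy: the function names, the probabilities (0.7 for rotation, 0.5 for zoom), the 96-pixel crop, the zoom range, and the random input images are placeholders chosen for the example, and rotation is again limited to multiples of 90 degrees.

```python
import numpy as np

def center_crop(image, size):
    """Keep a size x size window around the image centre."""
    h, w = image.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return image[top:top + size, left:left + size]

def zoom(image, factor):
    """Zoom in on the centre by `factor`, resampling back to the original
    size with nearest-neighbour lookup."""
    h, w = image.shape[:2]
    ch, cw = int(h / factor), int(w / factor)  # size of the zoomed-in window
    top, left = (h - ch) // 2, (w - cw) // 2
    rows = top + (np.arange(h) * ch) // h      # source row for each output row
    cols = left + (np.arange(w) * cw) // w     # source column for each output column
    return image[rows[:, None], cols]

def run_pipeline(images, n_samples, rng, p_rotate=0.7, p_zoom=0.5):
    """Draw n_samples augmented images; rotation and zoom are each applied
    only with the given probability."""
    outputs = []
    for _ in range(n_samples):
        image = images[rng.integers(len(images))]  # pick a random source image
        image = center_crop(image, 96)
        if rng.random() < p_rotate:
            image = np.rot90(image, int(rng.integers(1, 4)))
        if rng.random() < p_zoom:
            image = zoom(image, rng.uniform(1.1, 1.5))
        outputs.append(image)
    return outputs

rng = np.random.default_rng(0)
images = [rng.random((128, 128, 3)) for _ in range(10)]  # stand-in source images
augmented = run_pipeline(images, 1000, rng)
print(len(augmented), augmented[0].shape)
```

Applying each operation with its own probability means some outputs are only cropped, some are cropped and rotated, and so on, which gives more variety than always applying every step.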
Good one. From this I can see you mostly use Python. Can one do this in R too?
Sure. I can do R too :-)