Uncertain as to how to set off on my new voyage, I decided to reflect on what I need to learn: Python, and the other tools necessary to build the knowledge I need for deep learning. Which brings me to this post: the first one, and the starting point of it all.
My urge to learn this new tool comes from the fact that I don't relish collecting data. You see, so much time is lost going through hundreds of documents, gathering details from each one, and putting them on an extensive spreadsheet. A machine could do that. The question, and here it goes, is this: how do I make the machine (let's call it 'machine') reason the same way I would when I am collecting data?
And once the data is found, how do I tell it to move that data onto a spreadsheet?
So I decided to make a brief layout of what I am on the lookout for. Have I given it much thought? Probably not enough, but this is how it begins; these are the baby steps one has to take to get moving.
What data do I want?
- All of it. I want to be able to obtain whatever I need with a few lines of code.
Where is the data?
- In a database
- Word documents
- PDFs
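I haven't written a line of any of this yet, but poking around, the "turn everything into text" step might look something like the sketch below. Fair warning: the libraries (python-docx and pypdf for the documents, Python's built-in sqlite3 for the database) and every file, table, and column name here are assumptions of mine, not things I've tested.

```python
# A first, unverified sketch of step one: turning every source into plain text.
# Assumes the third-party libraries python-docx and pypdf are installed, and
# uses hypothetical file/database names standing in for my real documents.
import sqlite3

from docx import Document    # pip install python-docx
from pypdf import PdfReader  # pip install pypdf


def text_from_database(path="records.db"):
    # Pull a text column out of a (hypothetical) SQLite database.
    conn = sqlite3.connect(path)
    rows = conn.execute("SELECT notes FROM records").fetchall()
    conn.close()
    return "\n".join(row[0] for row in rows)


def text_from_word(path="report.docx"):
    # python-docx exposes a Word document as a list of paragraphs.
    doc = Document(path)
    return "\n".join(p.text for p in doc.paragraphs)


def text_from_pdf(path="report.pdf"):
    # pypdf extracts text page by page.
    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```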
What do I want Python to do?
- Convert all of the data into text.
- Find the strings of text I am interested in.
- Make sure those strings of text appear within a determined context.
- Copy each string, along with details of the context in which it was found, into a new dataset (I take a rough guess at what this could look like right after this list).
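And my guess at those remaining steps, finding the strings and copying them out with their surrounding context, is sketched here. The search term, the size of the context window, and the CSV output are all placeholders for whatever my real needs turn out to be.

```python
# An equally unverified sketch of the remaining steps: find the strings
# I care about, keep a window of surrounding context, and write both to
# a new dataset. Term and window size are placeholders.
import csv
import re


def collect(text, term, window=80):
    # Find every occurrence of `term` and capture `window` characters
    # of context on either side of the match.
    findings = []
    for match in re.finditer(re.escape(term), text):
        start = max(match.start() - window, 0)
        end = match.end() + window
        findings.append({"match": match.group(), "context": text[start:end]})
    return findings


def to_spreadsheet(findings, path="dataset.csv"):
    # Dump the findings into a CSV file, which any spreadsheet can open.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["match", "context"])
        writer.writeheader()
        writer.writerows(findings)
```

If both sketches held up, the whole pipeline could in theory be chained as `to_spreadsheet(collect(text_from_pdf(), "some term"))`. One line. That's the dream.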
Why is this so important?
Because I shouldn't be doing this. A machine should.
Why Deep Learning?
DL is essential for interpreting the context from which each string of text is copied. By understanding that context, the machine can decide when a string of text is relevant to my data collection.
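For now, the closest I can get to that idea in code is a placeholder: some function that looks at the context and answers yes or no. Today it would be a crude keyword rule like the one below (the cue words are invented); one day, if I get there, a trained model would take its place.

```python
# A hypothetical stand-in for the future deep learning step. Everything
# here, especially the cue words, is made up for illustration.
def is_relevant(context):
    # Placeholder rule standing in for a trained model: keep the finding
    # only if its context mentions certain words.
    required = ("invoice", "total")  # hypothetical cue words
    return any(word in context.lower() for word in required)


def filter_findings(findings):
    # Step three of my list: keep only the strings whose context checks out.
    return [f for f in findings if is_relevant(f["context"])]
```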
What I am writing may not make much sense at the moment, but as time goes by I will have a better idea of my needs and of Python's capabilities. Time is not on my side, and will not be for a good while, so I will learn in very small chunks and keep a log here of what I've learnt.
Hopefully this will help others decide whether or not to pursue a journey into self-taught deep learning.