That was an even more epic fail compared to the weather data competition hosted seven months back.
I am about 19 hours away from the deadline, but with me at this deep learning conference full time tomorrow, I know this is not going well, so I might as well get a head start on this post-mortem.
Yet again, I failed to submit to a Kaggle competition. Fool me once, shame on me. Fool me twice… damn. I am a fucking idiot.
Things I should have done better:
- I completed all my training locally on my own GPUs for accessibility reasons. I also organized my data/folders/dependency scripts in my own particular fashion (not yet a Python module published to pip… hence also not readily callable from Jupyter).
- My module imports were not robust enough and did not work as they should.
- The /kaggle/working, /kaggle/input folder conventions were completely foreign concepts to me. I organized data my own way… very, very differently.
- While prediction was working (but not really visually checked), submission to Kaggle was not. The RLE encoding and decoding schema was notoriously buggy, annoying to work with, and difficult to check for bugs. This led to almost 75% of the dev time being spent preparing for it. Still, it had bugs in the end, even at the local validation stage. Next time, I will just use a package or some tried-and-true public code to do this. God damn it.
- Speaking of dev time, 3 hours per week was already a far cry from enough. But dedicating 2 full weeks of almost 100% of my off hours to ML interview preparation was a TERRIBLE idea. Oh, and I was also away on vacation for one full week. All of this happened starting one month before the Kaggle deadline, a very, very bad combination. I need to clear my calendar for the two weeks BEFORE Kaggle deadlines.
- Overall, I think all these epic fails are the direct result of unvalidated/insufficient data pipeline work and the lack of an iterative improve-and-submit process. The deep learning part was not too bad (as in, it at least runs, but I have no clue how well it generalizes), but the check/iterate/improve portion was very, very poorly handled. Despite the recent post-interview data pipeline improvements, the pipeline failed to integrate with the Kaggle Kernel compute environment.
- More importantly, the lack of integration between my offline pipeline and the online Jupyter analysis pipeline was also a likely cause of multiple analysis bugs.
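For the record, the RLE step that ate most of my dev time really is only a few lines. A minimal sketch of mask-to-RLE encoding and decoding, assuming the common Kaggle convention (column-major pixel order, 1-indexed starts) — the function names and the round-trip check are mine, not from any particular competition kit:

```python
import numpy as np

def rle_encode(mask):
    """Encode a binary mask (H, W) into a Kaggle-style RLE string.
    Pixels are numbered top-to-bottom, then left-to-right (column-major),
    starting at 1."""
    pixels = mask.flatten(order="F")           # column-major, as Kaggle expects
    pixels = np.concatenate([[0], pixels, [0]])  # pad so edges become transitions
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]                    # convert run ends into run lengths
    return " ".join(str(x) for x in runs)

def rle_decode(rle, shape):
    """Decode an RLE string back into a binary mask of the given (H, W) shape."""
    mask = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    nums = list(map(int, rle.split()))
    for start, length in zip(nums[::2], nums[1::2]):
        mask[start - 1 : start - 1 + length] = 1   # starts are 1-indexed
    return mask.reshape(shape, order="F")

# Round-trip sanity check -- exactly the local validation I skipped.
m = np.zeros((4, 3), dtype=np.uint8)
m[1:3, 1] = 1
assert (rle_decode(rle_encode(m), m.shape) == m).all()
```

The round-trip assert is the cheap insurance: encode, decode, compare against the original mask before ever touching the submission CSV.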
Tomorrow I am at the 2019 REWork Deep Learning Summit, and I will see what I can fix, but given that my DeepLabV3+ models are not outputting binary labels, these are some very terrible signs. Also, from the masks CSV generated for the 836+ test images, having only two pixels of defects suggests… terrible performance. Should have examined these sooner. Not on the day of the submission… sigh.
Pretty neat conference. Heard quite a few sessions on AI and various efforts to implement it in clinical practice. As expected, the technology is the super simple part. The ethics, models, and compensation are what make it hell…
I was working on this Flask app, which requires a database commit after data model changes.
In a SINGLE line of code composed of THREE English words, I managed to make two consecutive typos, and it took two debugging sessions to discover the bugs. Impressive indeed.
This is the reason I do not work on production systems and never touch anything mission critical.
Occasionally, I help others build websites.
One client of mine approached me asking for help. Sure thing, no issues. You want a website, you get a website. Nothing fancy.
Then I emailed to ask what domain name they wanted, and they came back asking me what a “domain name” is: is that the name of the website? I hesitated, pondering my response, the purpose of my existence, and how best to respond to that question.
- LMGTFY: passive, but aggressive and maybe teach a lesson?
- Explain the concept: teach a lesson?
- Pretend to have never seen this email?
- Drop the client and move on with my life?
- Teach them how to look up an unknown concept? Because surely everyone in the world has heard of Wikipedia by now?
It really takes less than 5 minutes to educate yourself on something new.
Not everyone does that.
I will try to stick with those who at least attempt to educate themselves.
But sometimes I have to work with people like that. In the end, I wrote her a short email explaining it.
Such is the fate of working with people. You do what needs to be done to get things moving.
I recently upgraded my old laptop's SSD to a 1TB NVMe drive by cloning through an external M.2 enclosure. All went well. I then decided to recycle the old laptop SSD in that enclosure as a SPEEDY USB drive.
Massive giddiness ensues.
Then I realized… it could never be seen or formatted; on both Windows and Linux it failed to even initialize. Much research later, it dawned upon me that the old drive is a SATA M.2 SSD with B+M keys, while the new enclosure only takes M-key NVMe M.2 drives.
Now I am back to ordering a B+M-key-compatible SSD enclosure while my M-key NVMe enclosure sits empty, yearning for speedy copying…
This is really annoying.
I initially learned it as train, test, validate (with a held-out test dataset). But lately, I learned I might have been very mistaken, and I should instead be using Train, Validate, Test (with a held-out test dataset).
I shall name it TVT instead of TTV.
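The TVT order can be sketched in a few lines; a plain NumPy shuffle-and-slice, where the function name, the 70/15/15 ratios, and the seed are just my assumptions for illustration:

```python
import numpy as np

def tvt_split(n, train=0.7, val=0.15, seed=42):
    """Split n sample indices into Train / Validate / Test.
    The test set is the held-out portion: touched once, at the very end."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(n * train)
    n_val = int(n * val)
    return (idx[:n_train],                    # train: fit the model
            idx[n_train:n_train + n_val],     # validate: tune and select models
            idx[n_train + n_val:])            # test: final held-out evaluation

tr, va, te = tvt_split(100)
assert len(tr) == 70 and len(va) == 15 and len(te) == 15
assert len(set(tr) | set(va) | set(te)) == 100   # disjoint and complete
```

The point of the ordering is that validation data gets reused every time you tune, while the test slice stays untouched until the final scoring pass.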
So today a LinkedIn tech recruiter (who also happens to be a nice lady) and I got on the phone. After the initial nice greeting, I managed to fumble my reply into: “So, what I can do to you today…” instead of the more normal human interaction of “what can I do FOR you today?” I am so getting sued. LOL.
I commute 15 minutes from home to work by car (yes, very lucky). I rarely bothered to pair my phone's Bluetooth with the car audio to listen to anything productive, and usually just let the radio play, or sat in silence.
Then I started forcing myself to take the extra few minutes to hook up Bluetooth in the car (not the same car each time; car sharing) and listen to some Audible book. Before you realize it, you have finished several audiobooks already. 15 minutes per trip, 30 minutes per day, 3.5 hours per week: two weeks of commuting gets you 7 hours of audio content, which is typically the length of a shorter book. Get some exciting book, like those that teach you to negotiate better or tell you about cool history. I am glad Audible is getting me back into the habit of listening to stuff again. I am very grateful.
This little sucker took me wayyyyyyyyyyyy too long to figure out. I was coding in C# on Windows and this error popped up when trying to save a file.
What it probably should have said is: “Dude, I cannot save your shxt to a file. Something went wrong with the file creation process; it stopped working before I could write data to it. Find that out and fix it!”
I was trying to save an EMGU image to a path but kept getting this error. It turned out I had specified the file path incorrectly, to the tune of “AAAA\BBBB\CCC\.jpeg”, and the file creation step is not valid without a properly named file (most likely one that does not start with a “.”).