
Can AI make me trendier?

As a software engineer and generally analytic type, I like to craft theories for everything. Theories on how to build software, how to stay productive, how to be creative...and even how to dress well. For help with that last one, I decided to hire a personal stylist. As it turned out, I was not my stylist’s first software engineer client. “The problem with you people in tech is that you’re always looking for some sort of theory of fashion,” she told me. “But there is no formula–it’s about taste.”

Unfortunately, my stylist’s taste was a bit outside of my price range (I drew the line at a $300 hoodie). But I knew she was right. It’s true that computers (and maybe the people who program them) are better at solving problems with clear-cut answers than at navigating touchy-feely matters like taste. Fashion trends are not set by data-crunching CPUs; they’re set by human tastemakers, fashionistas and their modern-day equivalents, social media influencers.

I found myself wondering if I could build an app that combined trendsetters’ sense of style with AI’s efficiency to help me out a little. I started getting fashion inspiration from Instagram influencers who matched my style. When I saw an outfit I liked, I’d try to recreate it using items I already owned. It was an effective strategy, so I set out to automate it with AI.

First, I partnered up with one of my favorite programmers, who just so happened to also be an Instagram influencer, Laura Medalia (or codergirl_ on Instagram). With her permission, I uploaded all of Laura’s pictures to Google Cloud to serve as my outfit inspiration.
Image showing a screenshot of the Instagram profile of "codergirl."

Next, I painstakingly photographed every single item of clothing I owned, creating a digital archive of my closet.

Animated GIF showing a woman in a white room placing different clothing items on a mannequin and taking photos of them.

To compare my closet with Laura’s, I used the Google Cloud Vision Product Search API, which uses computer vision to identify similar products. If you’ve ever seen a “See Similar Items” tab when you’re shopping online, it was probably powered by a similar technology. I used this API to look through all of Laura’s outfits and all of my clothes to figure out which looks I could recreate. I bundled up all of the recommendations into a web app so that I could browse them on my phone, and voilà: I had my own AI-powered stylist. It looks like this:

Animated GIF showing different screens that display items of clothing that can be paired together to create an outfit.
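Under the hood, each recommendation comes from a product search query: hand the API a photo of one of my clothing items, and it returns the closest visual matches from an indexed set of outfit photos. If you’re curious, here’s a rough sketch of what one lookup looks like with the Vision API’s Python client. The filename, project, location and product set names are all placeholders, and it assumes the inspiration photos have already been indexed into a product set:

```python
from google.cloud import vision

client = vision.ImageAnnotatorClient()

# A photo of one item from my closet archive (placeholder filename).
with open("closet/blue_blazer.jpg", "rb") as f:
    image = vision.Image(content=f.read())

# Point the query at a product set indexed with the inspiration photos.
params = vision.ProductSearchParams(
    product_set="projects/my-project/locations/us-east1/productSets/outfits",
    product_categories=["apparel-v2"],
)
context = vision.ImageContext(product_search_params=params)

response = client.product_search(image, image_context=context)

# Each result is a visually similar product, with a similarity score.
for result in response.product_search_results.results:
    print(result.product.display_name, result.score)
```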

Thanks to Laura’s sense of taste, I have lots of new ideas for styling my own wardrobe. Here’s one look I was able to recreate:

Image showing two screens; on the left, a woman stands in a room wearing a fashionable outfit, with the items that make up that outfit in two panels below her. On the right, another woman wears a similar outfit.

If you want to see the rest of my newfound outfits, check out the YouTube video at the top of this post, where I go into all of the details of how I built the app, or read my blog post.

No, I didn’t end up with a Grand Unified Theory of Fashion—but at least I have something stylish to wear while I’m figuring it out.


30 years of family videos in an AI archive

My dad got his first video camera the day I was born nearly three decades ago. “Say hello to the camera!” are the first words he caught on tape, as he pointed it at a red, puffy baby (me) in a hospital bassinet. The clips got more embarrassing from there, as he continued to film through many diaper changes, temper tantrums and—worst of all—puberty.

Most of those potential blackmail tokens sat trapped on miniDV tapes or scattered across SD cards until two years ago when my dad uploaded them all to Google Drive. Theoretically, since they were now stored in the cloud, my family and I could watch them whenever we wanted. But with more than 456 hours of footage, watching it all would have been a herculean effort. You can only watch old family friends open Christmas gifts so many times. So, as an Applied AI Engineer, I got down to business and built an AI-powered searchable archive of our family videos.

If you’ve ever used Google Photos, you’ve seen the power of using AI to search and organize images and videos. The app uses machine learning to identify people and pets, as well as objects and text in images. So, if I search “pool” in the Google Photos app, it’ll show me all the pictures and videos I ever took of pools.

But for this project, I needed a couple of features Photos doesn’t (yet!) support. First, because my dad’s first camera recorded footage to miniDV tapes, those videos were uploaded as meaty, two-hour-long movies with no useful metadata. Instead, my dad would start a clip by saying, “let me put a date on the screen here...” and a little white text snippet would appear in the bottom right corner of the frame. In between shots on a single reel, he’d say: “Say goodbye, I’m going to fade out now.” I would scream, “NO, DON’T FADE OUT,” while the screen faded to black. So, my first step was to use machine learning to automatically parse the date shown on the screen, and split the single long video into shorter clips after each fade out.

Image showing a frame from a home video with a white timestamp in the bottom corner.

In this picture, you can see the timestamp shown on screen. Using the Vision API, I could extract it to sort my videos by date.
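As a rough sketch, extracting that date from a single exported frame might look something like this with the Vision API’s Python client (the frame filename and example output are my own placeholders):

```python
from google.cloud import vision

# Assumes a frame has already been exported from the video, e.g. with ffmpeg.
client = vision.ImageAnnotatorClient()

with open("frames/frame_00421.jpg", "rb") as f:  # placeholder filename
    image = vision.Image(content=f.read())

# OCR all text in the frame; the first annotation is the full detected text.
response = client.text_detection(image=image)
if response.text_annotations:
    print(response.text_annotations[0].description)  # e.g. "DEC 25 1993"
```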

For this, I turned to the Video Intelligence API, a Google Cloud tool that lets developers analyze videos with machine learning. It allows you to replicate many of the features found in the Google Photos app—like tagging objects in images and recognizing on-screen text—and a whole lot more. For example, the API’s shot change detection feature automatically finds the timestamps in videos where a scene changes, which allowed me to split those long videos into smaller chunks.
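Here’s a minimal sketch of a shot change request with the Python client (the bucket and file names are placeholders):

```python
from google.cloud import videointelligence

client = videointelligence.VideoIntelligenceServiceClient()

# Kick off an asynchronous annotation job on a video stored in Cloud Storage.
operation = client.annotate_video(
    request={
        "features": [videointelligence.Feature.SHOT_CHANGE_DETECTION],
        "input_uri": "gs://my-bucket/family-videos/tape_01.mp4",  # placeholder
    }
)
result = operation.result(timeout=600)

# Each shot annotation marks the start and end of one continuous shot,
# which gives the cut points for splitting a long tape into clips.
for shot in result.annotation_results[0].shot_annotations:
    start = shot.start_time_offset.total_seconds()
    end = shot.end_time_offset.total_seconds()
    print(f"Shot from {start:.1f}s to {end:.1f}s")
```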

Using the label detection feature, I could search for all sorts of different events, like “bridal shower,” “wedding,” “bat and ball games” and “baby.” By searching “performance,” I was able to finally find one of my life’s proudest accomplishments on tape—a starring role singing “It’s Not Easy Being Green” in my kindergarten’s production of the Sesame Street musical.
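Label detection works much the same way; here’s a sketch, again with placeholder names:

```python
from google.cloud import videointelligence

client = videointelligence.VideoIntelligenceServiceClient()

operation = client.annotate_video(
    request={
        "features": [videointelligence.Feature.LABEL_DETECTION],
        "input_uri": "gs://my-bucket/family-videos/tape_01.mp4",  # placeholder
    }
)
result = operation.result(timeout=600)

# Segment-level labels describe what the API saw in each chunk of video,
# e.g. "wedding" or "performance", each with a confidence score.
for label in result.annotation_results[0].segment_label_annotations:
    for segment in label.segments:
        print(label.entity.description, segment.confidence)
```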

Image showing a still from a home video of a kindergarten stage performance.

My starring role as Kermit the Frog in my school’s Sesame Street musical. The Video Intelligence API tagged it as “performance”.  

The Video Intelligence API’s real “killer feature” for me was its ability to do audio transcription. By transcribing my videos, I was able to query clips by what people said in them. I could search for specific names (“Scott,” “Dale,” “grandma”), proper nouns (“Chuck E Cheese,” “Pokemon”) and unique phrases. By searching “first steps,” I found a clip of my dad saying, “Here she comes… plunk. That’s the first time she’s taken major steps,” alongside a video of me managing, just barely, to waddle along.
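Transcription is enabled by adding a speech config to the request. A sketch, with placeholder names as before:

```python
from google.cloud import videointelligence

client = videointelligence.VideoIntelligenceServiceClient()

config = videointelligence.SpeechTranscriptionConfig(language_code="en-US")

operation = client.annotate_video(
    request={
        "features": [videointelligence.Feature.SPEECH_TRANSCRIPTION],
        "input_uri": "gs://my-bucket/family-videos/tape_01.mp4",  # placeholder
        "video_context": videointelligence.VideoContext(
            speech_transcription_config=config
        ),
    }
)
result = operation.result(timeout=600)

# Each transcription carries the transcript text, which can be indexed
# and then searched for phrases like "first steps."
for transcription in result.annotation_results[0].speech_transcriptions:
    for alternative in transcription.alternatives:
        print(alternative.transcript)
```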

Image showing a still from a home video of a baby taking her first steps.

My first steps, which I found using the Video Intelligence API’s transcription feature. Here, my dad says, “...this is the first time she’s taken major steps.”

In the end, machine learning helped me build exactly the kind of archive I wanted—one that let me search my family videos by memories, not timestamps.

P.S. Want to see how I built it? Check out my technical blog post or catch the video on the Cloud YouTube channel.