In today’s business climate, the need for companies to be agile, innovative and able to scale is greater than ever. APIs are drivers for these needs, but before an organization can reap the benefits of APIs, it must set up an API strategy.
Developing an API strategy can be broken down into four stages: establishing your strategy, aligning your organization and culture, building the technology needed to support the strategy and engaging with your ecosystem. This series of articles examines the fourth stage.
All successful API providers make it a priority to actively engage with their developer communities. Doing so helps build a vibrant ecosystem of developers and partners who can extend the reach of their API strategy. ProgrammableWeb has spoken to a number of providers about their best practices for engaging with developers. This article will focus on Databricks, provider of a data analytics and AI platform. We spoke with Ryan Boyd, Head of Developer Relations at Databricks.
Boyd’s experience at Databricks as well as previous companies has influenced his view that developer engagement is not a one-sided conversation but is a dialog between the company and its developer community. He discusses how those conversations can be fostered through online meetups and his plans for making them even more engaging in the coming year.
Boyd also talks about the importance of treating an API as a product in order to help developers feel that it will be viable and well supported for years to come.
To find out more about how Databricks approaches developer engagement, read the transcript of the interview below. This interview has been edited for clarity and length.
ProgrammableWeb: Hi, this is Wendell Santos, editor of ProgrammableWeb, and today I’m speaking with Ryan Boyd, Head of Developer Relations at Databricks. Ryan, thank you for taking a few minutes to chat with me. I was wondering if you could take a couple of minutes and tell our readers a little bit about yourself, some of your background, and also a little bit about Databricks for anyone who might not be familiar with the company.
Ryan Boyd: Sure. I've been in the field of Developer Relations for about 15 years now. I started my career at Google as one of the first engineers on their API support team, which grew into their Developer Relations team. My focus at Google was on how to make developers successful using Google Data APIs, Google Apps APIs, and eventually, the Google Cloud Platform, where I led the Developer Relations team. And really what motivated me throughout my career at Google and then moving on to some other organizations is how we can have technology success yield business success, which in turn yields personal success for the developers in our community. I did a lot of that at Google, then went on to spend five years at Neo4j and eventually joined Databricks, taking the kind of career path of focusing on developer success or what we now call practitioner success in the data and AI space.
Databricks, being the Data and AI company, felt like a perfect fit for that. A lot of people know Databricks by its origins. It was founded by the original creators of Apache Spark, who then went on to create a number of other open-source projects like Delta Lake and MLflow. Databricks, in general, helps organizations solve tough problems by helping them make sense of their data with an end goal of providing better healthcare or decreasing manufacturing failures or finding patterns in economic data.
The technology foundation for all those problems is the same. We help data engineers get data from their customers, users, and machines into the right form to help them start using it. Next, we help data analysts pull the data from their data lakehouses into their software for analysis, and then help data scientists and machine learning engineers develop the models and predictive algorithms. Really it's kind of bringing together all of these different disciplines into a team sport, bringing them on the same playing field, enabling them to work together to solve their data problems.
PW: Okay, great. And you've been at Databricks, how long now?
Boyd: Just over a year. I joined in mid-October of last year.
PW: Congratulations then. I'm sure early days were kind of like drinking from the fire hose, and now hopefully you’ve gotten your feet under you a little. What can you tell me about your strategy for developer engagement since you've been at Databricks?
Boyd: A lot of my view of developer relations and developer engagement is about filling in the gaps and making sure that there is a smooth pathway from the company out to the community, and also from the community back into the company. It is that two-way street. I'm one to always say that you can't really define developer relations because it is oftentimes filling in those gaps that exist elsewhere in the organization. But we've had a couple of different focus areas over the last year. First, our Databricks University Alliance tries to help professors teach data science, machine learning, analytics, et cetera in the classroom by providing them tools as well as expertise. We’re working on a partnership with Coursera to release MOOCs (massive open online courses) that cover both the practical parts of how to use technology as well as the underlying principles because we have a lot of amazing experts within the company and the community.
Second, we also publish a lot of content such as books and videos. The goal is to share best practices and make the technology easier for developers to adopt. And third, online meetups, engaging with the community more directly. It's kind of funny because I've been a strong proponent of online meetups as a way to bring a more diverse crowd into our community, both at Databricks as well as at Neo4j.
We kicked that off, and then a month or two later, COVID hit. So our timing worked out, kind of accidentally. But we really want to reach people and the developers where they are and the communities that they're in. At Neo4j we noticed a lot of times that even in the same city due to traffic congestion and things like that, in Sao Paulo [for example], our meetups were often online just for other people in Sao Paulo. [We found that] you couldn't get the folks together in the same room due to traffic and also let them have their day jobs. I strongly believe in building the online community, so that's been a big focus for us this year.
PW: Okay. There are a couple of interesting pieces you mentioned that I was hoping we could dive into. Can you explain a little bit about Databricks University?
Boyd: Yeah. Databricks was founded by the original creators of Apache Spark when they were completing their Ph.D. tenure at Berkeley. They were in the AMPLab at Berkeley where they created this technology and eventually launched the company out of it. So there's a solid foundation and respect for academia as it allows us to advance technology in the data and AI space. Our founders wanted us to have more of a presence in universities because it helps educate the next generation of technology professionals and it also gets [our founders] back to their roots and allows us to give back to the educational community.
As for the University Alliance, we've had over a hundred universities engaged since it launched in Q2 of 2020. Each university gets free access to the Databricks software and we're not looking for them to teach the software in the classroom. We want to provide the tools that enable them to teach the underlying topics such as Data Science, Machine Learning, Analytics, and make it easy for them. It’s great that we can provide the cloud technology that they need to teach those topics, but then we've also brought together the community of professors in that area. The professors hold weekly office hours to share knowledge amongst each other and some have done panels on how to teach these topics in the classroom. University Alliance offers them the platform to share their knowledge while providing the tools and resources they need to teach.
PW: Very good. I also want to talk a little bit more about the online meetups. You said that you wanted to reach developers where they are and that the timing was spot on given COVID. How did you actually go about doing that?
Boyd: Through our origins with Apache Spark we have a very large global community made up of people that are passionate about educating others about Spark. Within those communities we had these very technology-focused or individual technology-focused meetups happening all around the world prior to COVID. A big part of organizing the online meetup was reaching out to that community of physical meetups and providing knowledge of the online meetup. We also had an initial online Apache Spark meetup that we kind of converted over to being a broader Data + AI Online Meetup.
It’s really about bringing the whole data community together. We've been able to grow a decent community, I think probably over 4,800 or so members right now, that comes back every week. We’ve also looked for other channels as a way to continue to grow that beyond just the folks that have in the past attended our physical meetups.
For example, LinkedIn has a beta for live broadcasting that we’ve been using. We also broadcast live on YouTube. We can then share it out through other social media and some of the folks end up signing up for the meetups. Some of the folks from our social channels end up signing up or subscribing to our YouTube channel and then keep coming back. Right now we have 100-200 folks that join us live every week as we do these meetups. And that audience has been really growing as we've both increased the channels, as well as increased the variety of content we're offering.
PW: Do you have any goals or KPIs around the online meetups?
Boyd: The goals around the Data + AI Online Meetup have been largely around member growth and content consumption. What we want to figure out is how to make the online meetups more interactive, how to get the community members more engaged with each other, not just with us. We're aiming to do more things around that. We've had a variety of different guest presenters at the online meetups and we would like to do more. We want to have an open CFP (call for proposals) in order to get a wider variety of the community engaged. But we also want to figure out how to make the conferencing platforms more interactive. We do Q and A sessions, but you can’t see people’s faces during those.
Going into the next year our goals are going to go from being more around content consumption to being more around getting folks to be more interactive and involved in the content, getting the people in the community more involved in interacting with each other and in producing the content.
PW: That must be a different challenge nowadays. Before, when events were in person, there's naturally a desire for attendees to connect, but now, everyone is behind a screen and kind of anonymous, so that is trickier. Do you find that the people attending online events are having some level of engagement with each other?
Boyd: I think attendees are sharing ideas right now mostly in the Q and A sessions for the speakers. One person in the community asks questions, and other people get ideas from that and continue to ask questions. There's that type of thing. We also have chat streams that go along on the side of it. So we do see some of that, it's just a matter of how to improve it.
PW: Is the participation for these events mostly coming from the United States, or are you seeing a lot of participation in other places around the world?
Boyd: It's really all around the world. I mean our developer base, and I use that word and practitioner kind of interchangeably, but from a practitioner perspective, we have people from all around the globe. I think that Databricks certainly is very popular.
India is one country that stands out a lot. There are a lot of folks in the developer community in India. They may be working for multi-nationals, but there are still developers who have a need to talk with other developers, and these communities provide a way for them to do that.
PW: I wanted to switch gears and ask about your API. One area I'm always curious about is Documentation. What role, if any, for you guys does documentation play as far as engaging with your developers?
Boyd: It’s very important. The way that you measure success for developers can often be how long does it take to do X, right? If you think about the getting started experience, our developer relations team is working closely with the product management team and the product design team on shortening that time to what we call the first query or the first notebook command execution. We really want to bring that time down as much as possible. That's the type of metric that we're using to measure if we are making developers successful on the platform. And documentation plays an important role in that, right? Today, we have a lot of amazing reference documentation that shows users how everything works within Databricks and within the variety of different open-source products that we have underneath the covers, like Delta Lake, Spark, MLflow.
Our focus area now is trying to do more scenario-based getting started documentation. So if a developer wants to accomplish [a specific task] here's what they need to do. We first focused on making it possible for people to find the documentation for every function that they need to get something done. Now we're trying to increase the accessibility of that. That means things like more code completion within notebooks with pop-up syntax or function definitions, as you would find in IDEs. I think we're doing great from the reference style, but there's a lot of work that we can do to make sure that we document the various scenarios developers come across when using Databricks.
That's something that I and my team aim to do over the next year, but we also want to work more with the community to do it. I'm a strong believer in driving the community feedback into the documentation.
Think of PHP as an example. Everyone in the world busts on PHP, and for a lot of good reasons, but they've also done some things amazingly well. One of those has been allowing their community to make great contributions to their documentation. If you go to the PHP docs, the official doc may say the wrong thing, but you can find a good example of how to accomplish something down below in the comments. The result is that people can accomplish things with PHP even if they have less knowledge than the typical software engineer has.
Our aim is to do both of those things. Our docs should say the right thing, but we should also have the community chiming in and saying, "Hey, here's some best practices that we can share." That's another area where I aim to involve the community more and be more interactive: around the core documentation, but also around the getting started and scenario guides and things like that. Give the community a platform to share their tips and tricks and best practices there.
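The "time to first query" metric Boyd mentions in this answer can be illustrated with a minimal sketch. It assumes a simple event log of sign-ups and first queries; the event names, user IDs, timestamps, and the `minutes_to_first_query` helper are all hypothetical and are not drawn from how Databricks actually measures it.

```python
from datetime import datetime
from statistics import median

# Hypothetical event log: (user_id, event_name, ISO 8601 timestamp).
EVENTS = [
    ("u1", "signup", "2020-11-02T09:00:00"),
    ("u1", "first_query", "2020-11-02T09:12:00"),
    ("u2", "signup", "2020-11-02T10:00:00"),
    ("u2", "first_query", "2020-11-02T10:30:00"),
    ("u3", "signup", "2020-11-03T08:00:00"),  # signed up but never ran a query
]


def minutes_to_first_query(events):
    """Return {user_id: minutes from signup to first query} for users who did both."""
    signups, first_queries = {}, {}
    for user, name, timestamp in events:
        when = datetime.fromisoformat(timestamp)
        if name == "signup":
            signups[user] = when
        elif name == "first_query":
            # Keep the earliest query seen for each user.
            first_queries[user] = min(when, first_queries.get(user, when))
    return {
        user: (first_queries[user] - signups[user]).total_seconds() / 60
        for user in first_queries
        if user in signups
    }


durations = minutes_to_first_query(EVENTS)
median_minutes = median(durations.values())
print(durations, median_minutes)
```

Tracking the median rather than the mean is one possible design choice here, since a few users who return days later would otherwise dominate the figure.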
PW: Can you describe what that platform could look like? I’ve spoken with teams that run a guest blogger program where users can share tutorials. Some have mentioned making their documentation collaborative. Are you looking at similar ideas, something different, or are you still ideating on the best approach?
Boyd: I've experienced all of those different ways and each has their pros and cons. I don't think we've nailed down exactly how we want to do that. Among our open-source projects, we're aiming for the Delta Lake website to have more of a pull request capability. That would allow you to submit a pull request to update the website, add tutorials, add guides, add reference docs.
I think at a minimum, yes, you want to give people a platform in terms of a voice, in terms of sharing content either via our blogs or social media. I'm actually a fan of incorporating content into the core website. Blogs, I truly view as temporal things. It's great to have temporal content about how to do something, but you also want the long-lived content as well.
When I was at Google, for instance, I would allow contributors to write content for the core documentation set. At the top it would say, "Hey, this developer from this company submitted this. Thanks very much." It gave them a bit of recognition and at the same time, gave useful content for the community. I think you may see a combination of those approaches, but I don't know which will launch first at this point.
PW: How are the current docs built by the way? Are they done by hand? Is it automated in any way?
Boyd: I actually am not able to comment too much on that. The docs team is led by someone in our organization on the product side. I can say that they are very much focused on the automation of that process and the workflow, as every docs team I've worked with at Neo4j and at Google has been. That includes not only how the docs are built, but also figuring out how to test them to ensure they're up to date, and how to automate the tests of the code snippets within the docs. All of that is done. I don't know the specifics of the underlying technology that they use for that though.
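Boyd does not say which tooling the Databricks docs team uses, so the following is only a rough sketch of what automated testing of documentation snippets can look like, using Python's standard `doctest` module. The `DOC_PAGE` text and the `check_snippets` helper are hypothetical.

```python
import doctest

# Hypothetical documentation page with interactive-style code snippets.
DOC_PAGE = """
Adding two numbers:

>>> 1 + 2
3

Upper-casing a string:

>>> "delta".upper()
'DELTA'
"""


def check_snippets(text, name="doc_page"):
    """Run every '>>>' example in text; return (failed, attempted) counts."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, globs={}, name=name, filename=None, lineno=0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


failed, attempted = check_snippets(DOC_PAGE)
print(f"{attempted - failed} of {attempted} snippets passed")
```

Running a check like this in continuous integration would flag a snippet whose documented output no longer matches what the code actually produces.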
PW: This is slightly off-topic, but you've mentioned your background at Google a couple of times. One thing that popped in my head is that Google has had a lot of great APIs over the years, but they've also shut down a lot of APIs over the years. It will try an API out for a year or two, and then for whatever reason, decide not to support it any longer. Have you taken away any lessons from that approach that you have been able to apply going forward at Neo4j and then at Databricks? Pros or cons?
Boyd: When I was at Google, I helped create a program called the Google Codelabs, which was basically Google Labs but for the developer space. I did that because developers do want stability, but developers also want innovation. When I started at Google, I think that our API directory had five APIs in it and by the time I left, it was over 200. I think the reason there were so many APIs is because people demanded that level of integration and were excited by it and wanted that innovation. But that innovation came at a cost, as you pointed out.
Originally, with the Codelabs project, I was trying to mark APIs as enterprise-ready or production-ready. Then the lawyers said that I could only label which APIs are not ready for production. Eventually, Google deprecated the Codelabs program. What I learned is that it's really tough to balance the pace of innovation with the stability that developers need.
As I’ve moved into the enterprise world I’ve seen how the space definitely demands more in terms of predictability and stability on APIs. I'm a believer in not launching something unless you are pretty confident that that API is going to be around for a while. Unless you're pretty confident that you have the team there to support that API and make that API work, take feature requests.
I’ve come across people in the past that would launch an API and say, "Yeah, it's super well tested. It's super well documented." And I'd ask, "Okay, so who are the people? Who are the engineers? Who are the product managers that are focused on this API going forward?" If their response was, "It's these interns. They just finished up," I’d have to tell them, "No, sorry, that's not an API if the intern who's no longer at the company is the one that's in charge of it."
I’ve learned in my time, from working on the Google Cloud Platform, then on to Neo4j and Databricks, that when you are serving the enterprise community, it is even more important that you have clear and open communication about what level of support and longevity to expect from an API.
Then if people choose to build on it anyway, so be it. If you clearly communicate that this thing could be deprecated six months from now, and they still choose to build with it, great. They have an early-mover advantage. If the company later ends up building something similar to it, the developers still have that first-mover advantage and build up a loyal following from it. Sorry, that's getting a little bit off-topic, but I've learned a lot in terms of what people expect, and especially in the enterprise world, people expect more in terms of stability and longevity, and I will always be the voice of the community internally in any company I'm working for to try to make that happen. And so far I've had zero resistance to it at Databricks.
I’d like to thank Ryan Boyd for taking the time to speak with me and share Databricks’ approach to developer engagement. This is part of a series of interviews with developer relations experts such as Randall Degges at Okta and Lisa-Marie Namphy at Cockroach Labs. Be sure to keep an eye out for future interviews coming soon.