Love this. Thoroughly interested in jumping on this wagon.
Just throwing an idea out there: Instead of re-inventing the proverbial wheeled base, I wonder if it would make sense to build on top of some established wheel bases, like the turtlebot (https://www.turtlebot.com/). And then go ahead and build an array of manipulator/grasper options on top.
It’ll get you an established supply chain (including replacement parts) and let you focus on your specialty here: the reconfigurable agentic behavior stack.
It would be a no-brainer for me to jump on this and start building if the base was a turtlebot!
The mobile base below the hull is a TurtleBot 3 :) The arm is also a modified version of the open-source arm by Alexander Koch.
We do a lot of it with off-the-shelf components, and we design the whole system so that we can quickly iterate, manufacture and ship from our garage in Palo Alto.
Is this stable and fast enough that I could hand it a camera and train it to point that camera at myself as I move around playing a musical instrument to create a music video?
TL;DR: Yes!
You're not the first person to ask me this! If you look closely, there is actually a camera on the arm (that's partly how it learns tasks so fast), and we can use it to take pictures too. You can definitely define a primitive that would be "take picture", another one that is "send to server", and then have your own software assemble them in the way you want. Or just record and send it to the cloud / your computer.
Now if it's about using a better camera that the robot would hold, you'll need to wait for the next generation that we'll reveal later next year.
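To make that assemble-it-yourself idea concrete, here's a minimal sketch of what such a loop could look like. The maurice_sdk module, the Robot class, the primitive names, and the upload URL are all hypothetical placeholders for illustration, not our actual API:

```python
# Hypothetical sketch: chaining a "take picture" primitive with your own
# "send to server" step. Every maurice_sdk name below is an assumed
# placeholder, not the real SDK.
import time

import requests  # standard third-party HTTP library

from maurice_sdk import Robot  # hypothetical SDK entry point

robot = Robot.connect("maurice.local")     # hypothetical connect call
UPLOAD_URL = "https://example.com/frames"  # your own server

for _ in range(60):                        # one frame per second for a minute
    frame = robot.run_primitive("take_picture")  # assumed to return JPEG bytes
    requests.post(UPLOAD_URL, data=frame,
                  headers={"Content-Type": "image/jpeg"})
    time.sleep(1.0)
```

Your own software on the server side can then cut those frames together however you want.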
very cool! Both demos were very entertaining and charming. Could you share other behaviors/tasks that you foresee Maurice being able to tackle? I personally have trouble brainstorming tasks that I would find useful.
Congratulations on the launch.
This is pretty cool. I really like the simplicity.
While I was doing my PhD in HRI (~7 years ago), I played around with robots (mostly NAOs) to navigate and manipulate the real world. It was cool but really cumbersome.
I wish you all the best. Great UX is the key.
As an HCI person, really glad to hear that! I think there's a LOT to explore here indeed, and it's the key to democratization. We have a lot of ideas to reveal in the coming months, such as teleoperation with just the phone, VR, web-sims to experiment without buying first...
It's also pretty rare to find HRI people, so I'm very happy to chat further if you're interested (there's a Discord link in the docs and on the website)
Congrats on the launch.
Have you thought about assistive technology/accessibility tasks as well? I would love to use such a device to control the touch screens on the inaccessible coffee machines at my clients' offices, for example, which I can't operate without sight. I'm sure there are way more examples of such things.
Throwing complex robots at inaccessible devices is not the proper solution, but it is by far the quickest and most practical one. I'm not in the US, so I'm not even able to buy one, and I'm also hesitant to buy something that is totally bricked when the company/cloud goes under.
That's a great idea! We thought about it in the context of elder care, where they could ask the robot to perform a task for them, but we first need the models to be a little better - which is why we're starting here, to collect the data before it spreads further.
And by the way, we already have an app that you can use to control the robot from a distance, so you can use the skills you taught it remotely as you make it navigate your home!
On the worry that it would get bricked if the company goes under: our agent runs on other clouds, so it would be very easy to keep running - we would open-source it. But if you're not in the US, we can't easily ship to you for the first batch anyway :)
Thanks, I will keep an eye on your progress. If you want to discuss this kind of use case in the future, feel free to get in touch.
Really cool!
Had a couple questions:
- how far does the $50/mo get you?
- what's the battery life like/going to be like?
- do y'all allow people, or plan to allow people, to swap out (some of) the models or orchestrate them yourself, at least incrementally? Say I want to fine-tune my own Maurice personality.
- We estimate that on average it would use $1 to $3 a day, and we picked that number so that it would basically be free for you!
- Battery life is like 3-4 hours under very high-intensity usage. Most likely it should last a whole day once we start optimizing :) It's a big battery for now.
- Yes, we will allow full control. We don't intend to lock folks into our ecosystem, and if you look at the SDK you can see that you could actually train your own policies and just trigger them on the primitives (see the sketch below). You could even scrap the whole agent - we think you'd lose a lot of the value, but why not?
Apart from the cloud agent, most of the code will be open source anyway!
Oh, and on changing the personality: yes, I believe this could be done separately. The important reason we keep our agent in the cloud is that it's going to improve quickly on decision-making as folks use it, but personality has little to do with that at the end of the day. Even inside the system, the model that decides what to say is separate!
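To make the "train your own policies and trigger them on the primitives" point concrete (the sketch mentioned above), here's roughly what that could look like. Every maurice_sdk name here is a hypothetical placeholder, not the real SDK:

```python
# Hypothetical sketch: your own decision-making on top of the motion
# primitives, with no cloud agent involved. All SDK names are assumed.
from maurice_sdk import Robot  # hypothetical SDK entry point

robot = Robot.connect("maurice.local")

def my_policy(obs) -> str:
    """Swap in your own model here: a fine-tuned policy, your own LLM,
    or a personality you trained yourself."""
    if "mug" in obs.labels:  # assumed observation structure
        return "pick_up"
    return "explore"

while True:
    obs = robot.observe()                 # camera frame + state (assumed)
    robot.run_primitive(my_policy(obs))   # the primitives stay the same
```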
small nit, but why is the bot stopping so far from you? maybe it's the camera angle, but it looks like it wouldn't get within 10ft of either of you.
maybe for safety?
It might seem bad, but that's actually one of the coolest things about this new approach: it's the core model (today GPT-4o) that decides where it goes.
Here, this was a suboptimal decision by Maurice, and by default we do have it avoid making costly mistakes. But consider all the good decisions the agent made otherwise: navigating all these different rooms with no prior knowledge of where anything is (just pictures it took earlier), getting close to the glass where Vignesh was, back to Axel, back to bed at the end...
And here's the thing: every time an LLM provider releases a new model, Maurice gets better. We haven't even started fine-tuning the agent yet, but that will also improve its decisions a lot. There's a lot of low-hanging fruit for making better decisions, and we expect that in the coming months the system will quickly get smarter and faster.
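To give a feel for the approach (this is a simplified sketch, not our production prompt or code), the decision loop is conceptually something like the following, using the public OpenAI API with gpt-4o reading the pictures the robot took earlier. The primitive names and prompt are illustrative:

```python
# Conceptual sketch: the core model picks the next primitive based on
# snapshots the robot took while exploring. Prompt and primitive names
# are illustrative placeholders.
import base64

from openai import OpenAI  # official OpenAI client; gpt-4o accepts images

client = OpenAI()
PRIMITIVES = ["go_to_kitchen", "go_to_bedroom", "approach_person", "dock"]

def decide(task: str, snapshots: list[bytes]) -> str:
    """Ask the model where to go next, given earlier exploration pictures."""
    content = [{"type": "text",
                "text": f"Task: {task}. Answer with exactly one of: "
                        f"{', '.join(PRIMITIVES)}"}]
    for jpg in snapshots:
        b64 = base64.b64encode(jpg).decode()
        content.append({"type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{b64}"}})
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": content}],
    )
    return resp.choices[0].message.content.strip()
```

Swap in a newer model name and the same loop gets better for free - that's the point above.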
Looking really cool. The preorder stripe page says $300 but it says deposit. What’s the actual price when available?
It also says $2,000 on Stripe and on the website, right?
Thanks for the comment though - it means we should make it clearer, maybe?
It doesn’t on mobile, just checked again (browsing from the UK)
Thank you, on it!
This is very cool! I've been playing around in the same space with a simple tracked robot and a 2-DOF gripper. You seem to be quite a bit ahead of me in functionality.
https://imgur.com/a/WAHUIjQ
I'm using PaliGemma2 and MobileSAM for the vision part and Gemma for the thinking part. I'm hoping to stick with weights-available models as it's just a toy project.
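Roughly, that kind of see-then-think loop wires up like this (a minimal sketch using the stock Hugging Face APIs, not my exact code; the MobileSAM segmentation step is omitted for brevity, and the prompts and action names are illustrative):

```python
# Minimal weights-available pipeline: PaliGemma2 captions the camera frame,
# Gemma picks the next action.
import torch
from PIL import Image
from transformers import (AutoModelForCausalLM, AutoProcessor, AutoTokenizer,
                          PaliGemmaForConditionalGeneration)

VLM_ID = "google/paligemma2-3b-pt-224"
LLM_ID = "google/gemma-2-2b-it"

processor = AutoProcessor.from_pretrained(VLM_ID)
vlm = PaliGemmaForConditionalGeneration.from_pretrained(
    VLM_ID, torch_dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(LLM_ID)
llm = AutoModelForCausalLM.from_pretrained(
    LLM_ID, torch_dtype=torch.bfloat16, device_map="auto")

frame = Image.open("frame.jpg")  # latest camera frame

# 1) Vision: caption what the robot sees ("caption en" is a PaliGemma task prompt)
inputs = processor(text="caption en", images=frame,
                   return_tensors="pt").to(vlm.device)
input_len = inputs["input_ids"].shape[-1]
out = vlm.generate(**inputs, max_new_tokens=32)
scene = processor.decode(out[0][input_len:], skip_special_tokens=True)

# 2) Thinking: ask Gemma for the next move
prompt = (f"The robot sees: {scene}\n"
          "Choose one action: forward, left, right, grip, release.\nAction:")
ids = tokenizer(prompt, return_tensors="pt").to(llm.device)
gen = llm.generate(**ids, max_new_tokens=8)
action = tokenizer.decode(gen[0][ids["input_ids"].shape[-1]:],
                          skip_special_tokens=True)
print(action)
```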
For what it's worth, this contraption cost under £200, but I'm using a desktop with a 3090 as the brains.