---
id: frame-processors
title: Frame Processors
sidebar_label: Frame Processors
---
import Tabs from '@theme/Tabs'
import TabItem from '@theme/TabItem'
import useBaseUrl from '@docusaurus/useBaseUrl'
<div class="image-container">
  <svg xmlns="http://www.w3.org/2000/svg" width="283" height="535">
    <image href={useBaseUrl("img/frame-processors.gif")} x="18" y="33" width="247" height="469" />
    <image href={useBaseUrl("img/frame.png")} width="283" height="535" />
  </svg>
</div>
## What are frame processors?
Frame processors are functions that are written in JavaScript (or TypeScript) which can be used to process frames the camera "sees".
Inside those functions you can call **Frame Processor Plugins**, which are high performance native functions specifically designed for certain use-cases.
For example, you might want to create an object detector app without writing any native code, while still achieving native performance:
```jsx
function App() {
  const frameProcessor = useFrameProcessor((frame) => {
    'worklet'
    const objects = detectObjects(frame)
    console.log(`Detected ${objects.length} objects.`)
  }, [])

  return (
    <Camera
      {...cameraProps}
      frameProcessor={frameProcessor}
    />
  )
}
```
Frame Processors are not limited to object detection; other examples include:
* **ML** for **facial recognition**
* Using **Tensorflow**, **MLKit Vision**, **Apple Vision** or other libraries
* Creating **realtime video-chats** using **WebRTC** to directly send the camera frames over the network
* Creating scanners for custom codes such as **Snapchat's SnapCodes** or **Apple's AppClips**
* Creating **snapchat-like filters**, e.g. draw a dog-mask filter over the user's face
* Creating **color filters** with depth-detection
* **Drawing** boxes, text, overlays, or colors on the screen in realtime
* Rendering **filters** and shaders such as Blur, inverted colors, beauty filter, or more on the screen
Because they are written in JS, Frame Processors are simple, powerful, extensible and easy to create while still running at native performance. (Frame Processors can run up to 1000 times a second!) Also, you can use fast-refresh to quickly see changes while developing or publish [over-the-air updates](https://github.com/microsoft/react-native-code-push) to tweak the object detector's sensitivity in live apps without pushing a native update.
### react-native-worklets-core
Frame Processors require [react-native-worklets-core](https://github.com/margelo/react-native-worklets-core) 0.2.0 or higher. Install it:
```sh
npm i react-native-worklets-core
```
And add the plugin to your `babel.config.js`:
```js
module.exports = {
  plugins: [
    // highlight-next-line
    ['react-native-worklets-core/plugin'],
  ],
}
```
## The `Frame`
A Frame Processor is called for every Camera frame, and exposes information about the frame in the [`Frame`](/docs/api/interfaces/Frame) parameter.
The [`Frame`](/docs/api/interfaces/Frame) parameter wraps the native GPU-based frame buffer in a C++ HostObject (a ~1.5MB buffer at 4k), and allows you to access information such as its resolution or pixel format directly from JS:
```ts
const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  console.log(`Frame: ${frame.width}x${frame.height} (${frame.pixelFormat})`)
}, [])
```
Additionally, you can also directly access the Frame's pixel data using [`toArrayBuffer()`](/docs/api/interfaces/Frame#toarraybuffer):
```ts
const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  if (frame.pixelFormat === 'rgb') {
    const buffer = frame.toArrayBuffer()
    const data = new Uint8Array(buffer)
    console.log(`Pixel at 0,0: RGB(${data[0]}, ${data[1]}, ${data[2]})`)
  }
}, [])
```
It is however recommended to use native **Frame Processor Plugins** for heavy processing, as they are much faster than JavaScript and can sometimes operate on the GPU buffer directly.
You can simply pass a `Frame` to a native Frame Processor Plugin directly.
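For illustration, the JS side of such a plugin call is just a small worklet-compatible wrapper around the native plugin. The following is a minimal sketch, assuming a native plugin registered under the hypothetical name `detectObjects` and the `VisionCameraProxy.initFrameProcessorPlugin` API from recent VisionCamera versions (see ["Creating Frame Processor Plugins"](/docs/guides/frame-processors-plugins-overview) for the full registration flow):
```ts
import { VisionCameraProxy, type Frame } from 'react-native-vision-camera'

// Hypothetical plugin name: the native side must register a plugin under this exact name
const plugin = VisionCameraProxy.initFrameProcessorPlugin('detectObjects', {})

export function detectObjects(frame: Frame): object[] {
  'worklet'
  if (plugin == null) throw new Error('Failed to load Frame Processor Plugin "detectObjects"!')
  // the Frame is passed to the native plugin directly, results are bridged back through JSI
  return plugin.call(frame) as object[]
}
```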
## Interacting with Frame Processors
### Access JS values
Since Frame Processors run in [Worklets](https://github.com/margelo/react-native-worklets-core/blob/main/docs/WORKLETS.md), you can directly use JS values such as React state, which are copied into the Frame Processor as read-only values:
```tsx
// User can look for specific objects
const targetObject = 'banana'

const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  const objects = detectObjects(frame)
  const bananas = objects.filter((o) => o.type === targetObject)
  console.log(`Detected ${bananas.length} bananas!`)
}, [targetObject])
```
### Shared Values
You can also easily read from, and assign to [Shared Values](https://github.com/margelo/react-native-worklets-core/blob/main/docs/WORKLETS.md#shared-values), which can be written to from inside a Frame Processor and read from any other context (either React JS, Skia, or Reanimated):
```tsx
const bananas = useSharedValue([])

// Detect Bananas in Frame Processor
const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  const objects = detectObjects(frame)
  bananas.value = objects.filter((o) => o.type === 'banana')
}, [bananas])

// Draw bananas in a Skia Canvas
const onDraw = useDrawCallback((canvas) => {
  for (const banana of bananas.value) {
    const rect = Skia.XYWHRect(banana.x,
                               banana.y,
                               banana.width,
                               banana.height)
    const paint = Skia.Paint()
    paint.setColor(Skia.Color('red'))
    canvas.drawRect(rect, paint)
  }
})
```
### Call functions
You can also call back to the React-JS thread by using `Worklets.createRunInJsFn(...)`:
```tsx
const onFaceDetected = Worklets.createRunInJsFn((face: Face) => {
  navigation.push("FiltersPage", { face: face })
})

const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  const faces = scanFaces(frame)
  if (faces.length > 0) {
    onFaceDetected(faces[0])
  }
}, [onFaceDetected])
```
## Threading
By default, Frame Processors run synchronously with the Camera pipeline. Anything that takes longer than one Frame interval might block the Camera from streaming new Frames.
For example, if your Camera is running at **30 FPS**, your Frame Processor has **33ms** to finish executing before the next Frame is dropped. At **60 FPS**, you only have **16ms**.
### Running asynchronously
For longer running processing, you can use [`runAsync(..)`](/docs/api/#runasync) to run code asynchronously on a different Thread:
```ts
const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  console.log("I'm running synchronously at 60 FPS!")

  runAsync(frame, () => {
    'worklet'
    console.log("I'm running asynchronously, possibly at a lower FPS rate!")
  })
}, [])
```
### Running at a throttled FPS rate
Some Frame Processor Plugins don't need to run on every Frame, for example a Frame Processor that detects the brightness in a Frame only needs to run twice per second. You can achieve this by using [`runAtTargetFps(..)`](/docs/api/#runattargetfps):
```ts
const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  console.log("I'm running synchronously at 60 FPS!")

  runAtTargetFps(2, () => {
    'worklet'
    console.log("I'm running synchronously at 2 FPS!")
  })
}, [])
```
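The two helpers can also be combined, for example to run a heavy detection plugin on a separate thread and only a few times per second. The following is a rough sketch; the FPS value and the `detectObjects` plugin are arbitrary examples:
```ts
const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  runAtTargetFps(5, () => {
    'worklet'
    runAsync(frame, () => {
      'worklet'
      // heavy processing runs on a separate thread, at most ~5 times per second
      const objects = detectObjects(frame)
      console.log(`Detected ${objects.length} objects.`)
    })
  })
}, [])
```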
## Native Frame Processor Plugins
Since JavaScript is slower than native languages, it is recommended to use native Frame Processor Plugins for heavy processing.
Such native plugins benefit from faster languages (Objective-C/Swift, Java/Kotlin, or C++) and can make use of CPU vector- or GPU-acceleration.
### Creating native Frame Processor Plugins
VisionCamera provides an easy-to-use API for creating native Frame Processor Plugins, which are used to either wrap existing algorithms (example: ["MLKit Face Detection"](https://developers.google.com/ml-kit/vision/face-detection)), or build your own custom algorithms.
Its binding point is a simple callback function that gets called with the native frame type (`CMSampleBuffer` on iOS, `Image` on Android), which you can use for any kind of processing.
The native plugin can accept parameters (e.g. for configuration) and return any kind of values as a result, which are bridged through JSI.
See: ["Creating Frame Processor Plugins"](/docs/guides/frame-processors-plugins-overview).
### Using Community Plugins
Community Frame Processor Plugins are distributed through npm. To install the [vision-camera-resize-plugin](https://github.com/mrousavy/vision-camera-resize-plugin) plugin, run:
```bash
npm i vision-camera-resize-plugin
cd ios && pod install
```
That's it! 🎉 Now you can use it:
```ts
const { resize } = useResizePlugin()

const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  const smallerFrame = resize(frame, {
    size: {
      // ...
    },
  })

  // ...
}, [resize])
```
Check out [Frame Processor community plugins](/docs/guides/frame-processor-plugins) to discover available community plugins.
## Selecting a Format for a Frame Processor
When running frame processors, it is often important to choose an appropriate [format](/docs/guides/formats). Here are some general tips to consider (a configuration sketch follows the list):
* If you are running heavy AI/ML calculations in your frame processor, make sure to [select a format](/docs/guides/formats) that has a lower resolution to optimize its performance. You can also resize the Frame on-demand.
* Sometimes a frame processor plugin only works with specific [pixel formats](/docs/api/interfaces/CameraDeviceFormat#pixelformats). Some plugins (like Tensorflow Lite models) don't work with `yuv`, so use a [`pixelFormat`](/docs/api/interfaces/CameraProps#pixelformat) of `rgb` instead.
* Some Frame Processor plugins don't work with HDR formats. In this case, you need to disable [`videoHdr`](/docs/api/interfaces/CameraProps#videohdr).
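As a rough sketch of how these tips can fit together (the resolution, FPS and prop values below are arbitrary examples, and the usual imports from `react-native-vision-camera` are assumed), you could select a lower-resolution `rgb` format like this:
```tsx
function App() {
  const device = useCameraDevice('back')
  // Prefer a lower-resolution video format so Frame Processing stays within the frame budget
  const format = useCameraFormat(device, [
    { videoResolution: { width: 1280, height: 720 } },
    { fps: 30 },
  ])

  const frameProcessor = useFrameProcessor((frame) => {
    'worklet'
    // ...
  }, [])

  if (device == null) return null
  return (
    <Camera
      style={{ flex: 1 }}
      device={device}
      isActive={true}
      format={format}
      // some plugins (e.g. Tensorflow Lite models) require 'rgb' instead of 'yuv'
      pixelFormat="rgb"
      // some plugins don't work with HDR formats
      videoHdr={false}
      frameProcessor={frameProcessor}
    />
  )
}
```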
## Benchmarks
Frame Processors are _really_ fast. I have used [MLKit Vision Image Labeling](https://firebase.google.com/docs/ml-kit/ios/label-images) to label 4k Camera frames in realtime, and measured the following results:
* Fully natively (written in pure Objective-C, no React interaction at all), I have measured an average of **68ms** per call.
* As a Frame Processor Plugin (written in Objective-C, called through a JS Frame Processor function), I have measured an average of **69ms** per call.
This means that the Frame Processor API only takes ~1ms longer than a fully native implementation, making it **the fastest and easiest way to run any sort of Frame Processing in React Native**.
## Disabling Frame Processors
The Frame Processor API spawns a secondary JavaScript Runtime which consumes a small amount of extra CPU and RAM. Additionally, compile time increases since Frame Processors are written in native C++. If you're not using Frame Processors at all, you can disable them:
<Tabs
  groupId="environment"
  defaultValue="rn"
  values={[
    {label: 'React Native', value: 'rn'},
    {label: 'Expo', value: 'expo'}
  ]}>
<TabItem value="rn">
### Android
Inside your `gradle.properties` file, add the `disableFrameProcessors` flag:
```groovy
VisionCamera_disableFrameProcessors=true
```
Then, clean and rebuild your project.
### iOS
Inside your `Podfile`, add the `VCDisableFrameProcessors` flag:
```ruby
$VCDisableFrameProcessors = true
```
</TabItem>
<TabItem value="expo">
Inside your Expo config (`app.json`, `app.config.json` or `app.config.js`), add the `disableFrameProcessors` flag to the `react-native-vision-camera` plugin:
```json
{
  "name": "my app",
  "plugins": [
    [
      "react-native-vision-camera",
      {
        // ...
        "disableFrameProcessors": true
      }
    ]
  ]
}
```
</TabItem>
</Tabs>
<br />
#### 🚀 Next section: [Zooming](/docs/guides/zooming) (or [creating a Frame Processor Plugin](/docs/guides/frame-processors-plugins-overview))