---
id: frame-processors
title: Frame Processors
sidebar_label: Frame Processors
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import useBaseUrl from '@docusaurus/useBaseUrl';
<div>
  <svg xmlns="http://www.w3.org/2000/svg" width="283" height="535" style={{ float: 'right' }}>
    <image href={useBaseUrl("img/frame-processors.gif")} x="18" y="33" width="247" height="469" />
    <image href={useBaseUrl("img/frame.png")} width="283" height="535" />
  </svg>
</div>
### What are frame processors?
Frame processors are functions written in JavaScript (or TypeScript) that **process the frames the camera "sees"**.
Inside those functions you can call **Frame Processor Plugins**, which are high-performance native functions designed for specific use-cases.
For example, you might want to create a [Hotdog/Not Hotdog detector app](https://apps.apple.com/us/app/not-hotdog/id1212457521) **without writing any native code**, while still **achieving native performance**:
```jsx
function App() {
  const frameProcessor = useFrameProcessor((frame) => {
    'worklet'
    const isHotdog = detectIsHotdog(frame)
    console.log(isHotdog ? "Hotdog!" : "Not Hotdog.")
  }, [])

  return (
    <Camera
      {...cameraProps}
      frameProcessor={frameProcessor}
    />
  )
}
```
Frame Processors are by no means limited to Hotdog detection; other examples include:
* **AI** for **facial recognition**
* **AI** for **object detection**
* Using **TensorFlow**, **MLKit Vision**, **Apple Vision** or other libraries
* Creating **realtime video-chats** using **WebRTC** to directly send the camera frames over the network
* Creating scanners for **QR codes**, **Barcodes** or even custom codes such as **Snapchat's SnapCodes** or **Apple's AppClips**
* Creating **Snapchat-like filters**, e.g. drawing a dog-mask filter over the user's face
* Creating **color filters** with depth-detection
* Drawing boxes, text, overlays, or colors on the screen in realtime
* Rendering filters and shaders such as blur, inverted colors, or beauty filters on the screen
Because they are written in JS, Frame Processors are **simple**, **powerful**, **extensible** and **easy to create**, while still running at **native performance** (Frame Processors can run up to **1000 times a second**!). You can also use **fast-refresh** to quickly see changes while developing, or publish [over-the-air updates](https://github.com/microsoft/react-native-code-push) to tweak the Hotdog detector's sensitivity in live apps without pushing a native update.
:::note
Frame Processors require [**react-native-worklets-core**](https://github.com/chrfalch/react-native-worklets-core) 1.0.0 or higher.
:::
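To set it up, install the library and register its Babel plugin (a minimal sketch, assuming a default React Native project; see the react-native-worklets-core README for the exact steps):

```bash
npm i react-native-worklets-core
# then add 'react-native-worklets-core/plugin' to the "plugins" list in babel.config.js
```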
### Interacting with Frame Processors
Since Frame Processors run in [**Worklets**](https://docs.swmansion.com/react-native-reanimated/docs/fundamentals/worklets), you can directly use JS values such as React state:
```tsx
// can be controlled with a Slider
const [sensitivity, setSensitivity] = useState(0.4)

const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  const isHotdog = detectIsHotdog(frame, sensitivity)
  console.log(isHotdog ? "Hotdog!" : "Not Hotdog.")
}, [sensitivity])
```
You can also read from and assign to [**Shared Values**](https://docs.swmansion.com/react-native-reanimated/docs/fundamentals/shared-values) to create smooth, 60 FPS animations.
In this example, we detect a cat in the frame. If a cat is found, we assign the `catBounds` Shared Value to the cat's coordinates on screen, which we then use in a `useAnimatedStyle` hook to position the red rectangle surrounding the cat. This updates in realtime on the UI Thread, and can also be smoothly animated with `withSpring` or `withTiming` (see the sketch after this example):
```tsx {7}
// represents position of the cat on the screen 🐈
const catBounds = useSharedValue({ top: 0, left: 0, right: 0, bottom: 0 })

// continuously sets 'catBounds' to the cat's coordinates on screen
const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  catBounds.value = scanFrameForCat(frame)
}, [catBounds])

// uses 'catBounds' to position the red rectangle on screen.
// smoothly updates on UI thread whenever 'catBounds' is changed
const boxOverlayStyle = useAnimatedStyle(() => ({
  position: 'absolute',
  borderWidth: 1,
  borderColor: 'red',
  ...catBounds.value
}), [catBounds])

return (
  <View>
    <Camera {...cameraProps} frameProcessor={frameProcessor} />
    {/* draws a red rectangle on the screen which surrounds the cat */}
    <Reanimated.View style={boxOverlayStyle} />
  </View>
)
```
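Since `catBounds` is a Shared Value, you can wrap the reads in `withSpring` to animate the overlay towards each new detection instead of snapping it. A minimal sketch, reusing `catBounds` from the example above:

```tsx
// sketch: spring-animate each edge of the overlay towards the latest detection
const boxOverlayStyle = useAnimatedStyle(() => ({
  position: 'absolute',
  borderWidth: 1,
  borderColor: 'red',
  top: withSpring(catBounds.value.top),
  left: withSpring(catBounds.value.left),
  right: withSpring(catBounds.value.right),
  bottom: withSpring(catBounds.value.bottom)
}), [catBounds])
```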
And you can also call back to the React-JS thread by using `createRunInJsFn(...)`:
```tsx {1}
const onQRCodeDetected = Worklets.createRunInJsFn((qrCode: string) => {
  navigation.push("ProductPage", { productId: qrCode })
})

const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  const qrCodes = scanQRCodes(frame)
  if (qrCodes.length > 0) {
    onQRCodeDetected(qrCodes[0])
  }
}, [onQRCodeDetected])
```
### Running asynchronously
Since Frame Processors run synchronously with the Camera Pipeline, anything taking longer than one Frame interval might block the Camera from streaming new Frames. To avoid this, you can use `runAsync` to run code asynchronously on a different Thread:
```ts
const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  console.log("I'm running synchronously at 60 FPS!")

  runAsync(() => {
    'worklet'
    console.log("I'm running asynchronously, possibly at a lower FPS rate!")
  })
}, [])
```
### Running at a throttled FPS rate
Some Frame Processor Plugins don't need to run on every Frame; for example, a Frame Processor that detects the brightness of a Frame might only need to run twice per second:
```ts
const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  console.log("I'm running synchronously at 60 FPS!")

  runAtTargetFps(2, () => {
    'worklet'
    console.log("I'm running synchronously at 2 FPS!")
  })
}, [])
```
### Using Frame Processor Plugins
Frame Processor Plugins are distributed through npm. To install the [**vision-camera-image-labeler**](https://github.com/mrousavy/vision-camera-image-labeler) plugin, run:
```bash
npm i vision-camera-image-labeler
cd ios && pod install
```
That's it! 🎉 Now you can use it:
```ts
const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  const labels = labelImage(frame)
  // ...
}, [])
```
Check out [**Frame Processor community plugins**](/docs/guides/frame-processor-plugin-list) to discover plugins, or [**start creating a plugin yourself**](/docs/guides/frame-processors-plugins-overview)!
### Selecting a Format for a Frame Processor
When running frame processors, it is often important to choose an appropriate [format](/docs/guides/formats). Here are some general tips to consider:
* If you are running heavy AI/ML calculations in your frame processor, make sure to [select a format](/docs/guides/formats) that has a lower resolution to optimize its performance (see the sketch after this list).
* Sometimes a frame processor plugin only works with specific [pixel formats](/docs/api/interfaces/CameraDeviceFormat#pixelformat). Some plugins (like MLKit) don't work with `x420`.
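As a rough sketch, you could pick a lower-resolution format before passing it to the `<Camera>`. The `videoWidth`/`videoHeight` fields and the `"yuv"` pixel format below are assumptions for illustration; check the [formats guide](/docs/guides/formats) and the `CameraDeviceFormat` type for the exact fields your version exposes:

```tsx
// sketch: prefer a low-resolution format for heavy AI/ML processing
// (field names are assumptions, see the CameraDeviceFormat type)
const format = useMemo(
  () => device?.formats.find((f) => f.videoWidth <= 1280 && f.videoHeight <= 720),
  [device]
)

return <Camera {...cameraProps} device={device} format={format} pixelFormat="yuv" />
```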
### Benchmarks
Frame Processors are _really_ fast. I have used [MLKit Vision Image Labeling](https://firebase.google.com/docs/ml-kit/ios/label-images) to label 4k Camera frames in realtime, and measured the following results:
* Fully natively (written in pure Objective-C, no React interaction at all), I have measured an average of **68ms** per call.
* As a Frame Processor Plugin (written in Objective-C, called through a JS Frame Processor function), I have measured an average of **69ms** per call.
This means that **the Frame Processor API only takes ~1ms longer than a fully native implementation**, making it **the fastest and easiest way to run any sort of Frame Processing in React Native**.
> All measurements are recorded on an iPhone 11 Pro, benchmarked total execution time of the [`captureOutput`](https://developer.apple.com/documentation/avfoundation/avcapturevideodataoutputsamplebufferdelegate/1385775-captureoutput) function by using [`CFAbsoluteTimeGetCurrent`](https://developer.apple.com/documentation/corefoundation/1543542-cfabsolutetimegetcurrent). Running smaller images (lower than 4k resolution) is much quicker and many algorithms can even run at 60 FPS.
### Avoiding Frame-drops
Frame Processors will be **synchronously** called for each frame the Camera sees and have to finish executing before the next frame arrives, otherwise the next frame(s) will be dropped. At a frame rate of **30 FPS**, you have about **33ms** to finish processing a frame. For a QR Code Scanner, **5 FPS** (200ms) might suffice, while an object-tracking AI might need to run at the same frame rate as the Camera itself (e.g. **60 FPS** (16ms)).
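For example, a QR Code scan can stay inside its budget by throttling it with `runAtTargetFps` (from the section above). A minimal sketch, reusing the `scanQRCodes` plugin from the earlier example:

```ts
const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  // at 30 FPS, each Frame Processor call has a ~33ms budget (1000ms / 30)
  runAtTargetFps(5, () => {
    'worklet'
    // throttled to 5 FPS, so this scan only needs to finish within ~200ms
    const qrCodes = scanQRCodes(frame)
    // ...
  })
}, [])
```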
### ESLint react-hooks plugin
If you are using the [react-hooks ESLint plugin](https://www.npmjs.com/package/eslint-plugin-react-hooks), make sure to add `useFrameProcessor` to `additionalHooks` inside your ESLint config. (See ["advanced configuration"](https://www.npmjs.com/package/eslint-plugin-react-hooks#advanced-configuration))
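For example, in a `.eslintrc.json` (a minimal sketch; adapt it to however your ESLint config is structured):

```json
{
  "rules": {
    "react-hooks/exhaustive-deps": ["error", {
      "additionalHooks": "(useFrameProcessor)"
    }]
  }
}
```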
### Technical
#### Frame Processors
**Frame Processors** are JS functions that will be **workletized** using [react-native-worklets-core](https://github.com/chrfalch/react-native-worklets-core). They are created on a **parallel camera thread** using a separate JavaScript Runtime (_"VisionCamera JS-Runtime"_) and are **invoked synchronously** (using JSI) without ever going over the bridge. In a **Frame Processor** you can write normal JS code, call back to the React-JS Thread (e.g. `setState`), use [Shared Values](https://docs.swmansion.com/react-native-reanimated/docs/fundamentals/shared-values/) and call **Frame Processor Plugins**.
> See [**the example Frame Processor**](https://github.com/mrousavy/react-native-vision-camera/blob/cf68a4c6476d085ec48fc424a53a96962e0c33f9/example/src/CameraPage.tsx#L199-L203)
#### Frame Processor Plugins
**Frame Processor Plugins** are native functions (written in Objective-C, Swift, C++, Java or Kotlin) that are injected into the VisionCamera JS-Runtime. They can be **synchronously called** from your JS Frame Processors (using JSI) without ever going over the bridge. Because VisionCamera provides an easy-to-use plugin API, you can easily create a **Frame Processor Plugin** yourself. Some examples include [Barcode Scanning](https://developers.google.com/ml-kit/vision/barcode-scanning), [Face Detection](https://developers.google.com/ml-kit/vision/face-detection), [Image Labeling](https://developers.google.com/ml-kit/vision/image-labeling), [Text Recognition](https://developers.google.com/ml-kit/vision/text-recognition) and more.
> Learn how to [**create Frame Processor Plugins**](frame-processors-plugins-overview), or check out the [**example Frame Processor Plugin for iOS**](https://github.com/mrousavy/react-native-vision-camera/blob/main/example/ios/Frame%20Processor%20Plugins/Example%20Plugin%20(Swift)/ExamplePluginSwift.swift) or [**Android**](https://github.com/mrousavy/react-native-vision-camera/blob/main/example/android/app/src/main/java/com/mrousavy/camera/example/ExampleFrameProcessorPlugin.java).
#### The `Frame` object
The Frame Processor gets called with a `Frame` object, which is a **JSI HostObject**. It holds a reference to the native (C++) Frame Image Buffer (~10 MB in size) and exposes properties such as `width`, `height`, `bytesPerRow` and more to JavaScript, so you can access them synchronously. The `Frame` object can be passed around in JS, as well as returned from and passed to native **Frame Processor Plugins**.
> See [**this tweet**](https://twitter.com/mrousavy/status/1412300883149393921) for more information.
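For instance, a minimal sketch logging some of those properties (all reads happen synchronously through JSI):

```ts
const frameProcessor = useFrameProcessor((frame) => {
  'worklet'
  console.log(`Frame: ${frame.width}x${frame.height} (${frame.bytesPerRow} bytes per row)`)
}, [])
```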
### Disabling Frame Processors
The Frame Processor API spawns a secondary JavaScript Runtime which consumes a small amount of extra CPU and RAM. Additionally, compile time increases since Frame Processors are written in native C++. If you're not using Frame Processors at all, you can disable them:
<Tabs
  groupId="environment"
  defaultValue="rn"
  values={[
    {label: 'React Native', value: 'rn'},
    {label: 'Expo', value: 'expo'}
  ]}>
<TabItem value="rn">
#### Android
Inside your `gradle.properties` file, add the `disableFrameProcessors` flag:
```groovy
disableFrameProcessors=true
```
Then, clean and rebuild your project.
#### iOS
Inside your `Podfile`, add the `VCDisableFrameProcessors` flag:
```ruby
$VCDisableFrameProcessors = true
```
</TabItem>
<TabItem value="expo">
Inside your Expo config (`app.json`, `app.config.json` or `app.config.js`), add the `disableFrameProcessors` flag to the `react-native-vision-camera` plugin:
```json
{
  "name": "my app",
  "plugins": [
    [
      "react-native-vision-camera",
      {
        // ...
        "disableFrameProcessors": true
      }
    ]
  ]
}
```
</TabItem>
</Tabs>
<br />
#### 🚀 Next section: [Zooming](/docs/guides/zooming) (or [creating a Frame Processor Plugin](/docs/guides/frame-processors-plugins-overview))