Getting Started with Nexmo In-App Voice for JavaScript

In this guide we'll cover adding audio events to the Conversation we have created in the simple conversation with events guide. We'll deal with sending and receiving media events to and from the conversation.

Concepts

This guide will introduce you to the following concepts:

  • Audio Stream - The stream that the SDK gives you in your browser to listen to audio and send audio
  • Audio Leg - A server-side API term. Legs are a part of a conversation. When audio is enabled on a conversation, a leg is created
  • Media Event - a member:media event that fires on a Conversation when the media state changes for a member

Before you begin

  • Ensure you have run through the previous guides in this series

1 - Update the JavaScript App

We will use the application we created for the third getting started guide. All the basic setup has been done in the previous guides and should be in place. We can now focus on updating the client-side application.

1.1 - Add audio UI

First, we'll add the UI for the user to enable and disable audio, as well as an <audio> element that we'll use to play the Audio stream from the conversation. Let's add the UI at the top of the messages area.

<section id="messages">
  <div>
    <audio id="audio">
      <source>
    </audio>
    <button id="enable">Enable Audio</button>
    <button id="disable">Disable Audio</button>
  </div>
  ...
</section>

Then add references to the buttons and the <audio> element in the class constructor:

constructor() {
...
  this.audio = document.getElementById('audio')
  this.enableButton = document.getElementById('enable')
  this.disableButton = document.getElementById('disable')
}

1.2 - Add enable audio handler

We'll then update the setupUserEvents method to trigger conversation.media.enable() when the user clicks the Enable Audio button. conversation.media.enable() returns a promise that resolves with a stream object, which we'll use as the source for our <audio> element. We'll then add a listener on the <audio> element to start playing as soon as the metadata has been loaded.

setupUserEvents() {
...
  this.enableButton.addEventListener('click', () => {
    this.conversation.media.enable().then(stream => {
      // Older browsers may not have srcObject
      if ("srcObject" in this.audio) {
        this.audio.srcObject = stream;
      } else {
        // Avoid using this in new browsers, as it is going away.
        this.audio.src = window.URL.createObjectURL(stream);
      }

      this.audio.onloadedmetadata = () => {
        this.audio.play();
      }

      this.eventLogger('member:media')()
    }).catch(this.errorLogger)
  })
}

Note that enabling audio in a conversation establishes an audio leg for a member of the conversation. The audio is only streamed to other members of the conversation who have also enabled audio.

1.3 - Add disable audio handler

Next, we'll add the ability for a user to disable the audio stream as well. In order to do this, we'll update the setupUserEvents method to trigger conversation.media.disable() when the user clicks the Disable Audio button.

setupUserEvents() {
...
  this.disableButton.addEventListener('click', () => {
    this.conversation.media.disable().then(this.eventLogger('member:media')).catch(this.errorLogger)
  })
}

1.4 - Add member:media listener

With these first parts in place we're sending member:media events into the conversation. Now we're going to register a listener for them as well, one that updates the messageFeed. In order to do that, we'll add a listener for member:media events at the end of the setupConversationEvents method:

setupConversationEvents(conversation) {
  ...

  conversation.on("member:media", (member, event) => {
    console.log(`*** Member changed media state`, member, event)
    const text = `${member.user.name} <b>${event.body.audio ? 'enabled' : 'disabled'} audio in the conversation</b><br>`
    this.messageFeed.innerHTML = text + this.messageFeed.innerHTML
  })

}

If we want the conversation history to be updated, we need to add a case for member:media in the showConversationHistory switch:

showConversationHistory(conversation) {
  ...
  switch (value.type) {
    ...
    case 'member:media':
      eventsHistory = `${conversation.members.get(value.from).user.name} @ ${date}: <b>${value.body.audio ? "enabled" : "disabled"} audio</b><br>` + eventsHistory
      break;
    ...
  }
}

1.5 - Open the conversation in two browser windows

Now run index.html in two side-by-side browser windows, making sure to log in with the user name jamie in one and with alice in the other. Enable audio on both and start talking. You'll also see events being logged in the browser console.

That's it! You should now be able to enable and disable audio in the conversation.

Where next?

  • The next guide covers how to easily call users with the convenience method call(). This method offers an easy-to-use alternative for creating a conversation, inviting users and manually enabling their audio streams.
  • Have a look at the Nexmo Stitch JavaScript SDK API Reference

Getting Started with Nexmo In-App Voice for Android

In this getting started guide we'll cover adding audio events to the Conversation we created in the previous quickstarts. We'll deal with media events, both those that come to us via the conversation and those we send to it.

Concepts

This guide will introduce you to the following concepts:

  • Audio Stream - The stream that the SDK gives you on your device to listen to audio and send audio
  • Audio Leg - A server-side API term. Legs are a part of a conversation. When audio is enabled on a conversation, a leg is created
  • MemberMedia - A MemberMedia event fires on a Conversation when the media state changes for a member

Before you begin

  • Run through the previous quickstarts
  • If you're continuing on from the previous guide you may need to regenerate the users' JWTs. See quickstarts 1 and 2 for how to do so; a rough CLI example is shown below.
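
As a hedged reminder rather than the authoritative command (the exact claims and ACL paths come from quickstarts 1 and 2, and YOUR_APP_ID, the key path and the user name are placeholders you'll need to change), regenerating a user JWT with the Nexmo CLI looks something like this:

$ nexmo jwt:generate ./private.key sub=jamie exp=$(($(date +%s)+86400)) acl='{"paths":{"/v1/users/**":{},"/v1/conversations/**":{},"/v1/sessions/**":{},"/v1/devices/**":{},"/v1/image/**":{},"/v3/media/**":{},"/v1/applications/**":{},"/v1/push/**":{},"/v1/knocking/**":{}}}' application_id=YOUR_APP_ID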

1 - Update the Android App

We will use the application we created for the previous quickstarts. All the basic setup has been done in the previous guides and should be in place. We can now focus on updating the client-side application.

1.1 - Update permissions in AndroidManifest

Since we'll be working with audio, we need to add the necessary permissions to the app.

Add the following to your AndroidManifest

<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />

1.2 - Create an audio button

We want our users to be able to enable and disable audio at will. So in the ChatActivity we'll add a button in the options menu that will enable and disable audio.

// ChatActivity.java
@Override
public boolean onCreateOptionsMenu(Menu menu) {
    MenuInflater inflater = getMenuInflater();
    inflater.inflate(R.menu.chat_menu, menu);
    return true;
}

@Override
public boolean onOptionsItemSelected(MenuItem item) {
    switch (item.getItemId()) {
        case R.id.audio:
            //TODO: implement
            requestAudio();
            return true;
        default:
            return super.onOptionsItemSelected(item);
    }
}

Our chat_menu will look like this:

<?xml version="1.0" encoding="utf-8"?>
<menu xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto">
    <item android:id="@+id/audio"
        android:title="Audio"
        android:icon="@drawable/ic_record_voice_over_black_24dp"
        app:showAsAction="ifRoom"/>
</menu>

I've added an icon using Vector Assets, but you don't have to.

1.3 - Requesting Audio permissions

Before we can enable In-App Voice in our app we need to check for, or request, the RECORD_AUDIO permission. To check that we already have permission, we'll call the ContextCompat.checkSelfPermission() method. If we don't have it and ActivityCompat.shouldShowRequestPermissionRationale() returns true, the user has previously denied the request, so we'll show our reasoning for needing the permission. Otherwise we'll request the permission with ActivityCompat.requestPermissions(). We'll handle this logic in the requestAudio() method. We'll also need to create a constant PERMISSION_REQUEST_AUDIO so we can check whether the permission was granted or not.

For more info about permissions check out the Android Developers documentation. 

// ChatActivity.java
private static final int PERMISSION_REQUEST_AUDIO = 0;

//rest of activity...

private void requestAudio() {
    if (ContextCompat.checkSelfPermission(ChatActivity.this, RECORD_AUDIO) == PackageManager.PERMISSION_GRANTED) {
        //TODO: implement
        toggleAudio();
    } else {
        if (ActivityCompat.shouldShowRequestPermissionRationale(this, RECORD_AUDIO)) {
            logAndShow("Need permissions granted for Audio to work");
        } else {
            ActivityCompat.requestPermissions(ChatActivity.this, new String[]{RECORD_AUDIO}, PERMISSION_REQUEST_AUDIO);
        }
    }
}

After we ask the user for the RECORD_AUDIO permission we'll get the result of their decision in onRequestPermissionsResult(). If they granted it, we'll call toggleAudio() to enable/disable audio in the app. If the user didn't grant the permission, we'll show a toast and log that we need audio permissions enabled to continue.

// ChatActivity.java
@Override
public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
    switch (requestCode) {
        case PERMISSION_REQUEST_AUDIO: {
            if (grantResults.length > 0 && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
                //TODO: implement
                toggleAudio();
                break;
            } else {
                logAndShow("Enable audio permissions to continue");
                break;
            }
        }
        default: {
            logAndShow("Issue with onRequestPermissionsResult");
            break;
        }
    }
}

1.4 - Enabling and disabling audio

Now we can implement the toggleAudio() method. We'll use a boolean flag AUDIO_ENABLED to track whether audio is enabled and initialize it to false. When we change the state of audio in the app we'll update the flag.

At this point, enabling and disabling In-App Voice in your app is as simple as calling conversation.media(Conversation.MEDIA_TYPE.AUDIO).enable() or conversation.media(Conversation.MEDIA_TYPE.AUDIO).disable(). .disable() takes a RequestHandler as an argument with onSuccess() and onError() callbacks.

The .enable() method takes an AudioCallEventListener with multiple callbacks that handle the state of audio. The audio will enter an onRinging() state, then onCallConnected() fires when the user has joined the audio channel. If the user disables audio, onCallEnded() will fire. If any kind of error occurs, the onGeneralCallError() callback will fire. onAudioRouteChange() is called when the audio manager reports an audio device change, for example when switching from the device's in-ear speaker to a wired headset.

// ChatActivity.java
private boolean AUDIO_ENABLED = false;

//rest of activity...

private void toggleAudio() {
    if(AUDIO_ENABLED) {
        conversation.media(Conversation.MEDIA_TYPE.AUDIO).disable(new RequestHandler<Void>() {
            @Override
            public void onError(NexmoAPIError apiError) {
                logAndShow(apiError.getMessage());
            }

            @Override
            public void onSuccess(Void result) {
                AUDIO_ENABLED = false;
                logAndShow("Audio is disabled");
            }
        });
    } else {
        conversation.media(Conversation.MEDIA_TYPE.AUDIO).enable(new AudioCallEventListener() {
            @Override
            public void onRinging() {
                logAndShow("Ringing");
            }

            @Override
            public void onCallConnected() {
                logAndShow("Connected");
                AUDIO_ENABLED = true;
            }

            @Override
            public void onCallEnded() {
                logAndShow("Call Ended");
                AUDIO_ENABLED = false;
            }

            @Override
            public void onGeneralCallError(NexmoAPIError apiError) {
                logAndShow(apiError.getMessage());
                AUDIO_ENABLED = false;
            }

            @Override
            public void onAudioRouteChange(AppRTCAudioManager.AudioDevice device) {
                logAndShow("Audio Route changed");
            }
        });
    }
}

We could try out In-App Voice right now by launching the app on two devices and pressing the audio button, but we wouldn't know if the other user enabled audio on their device! In order to know that, we need to make some changes to ChatAdapter.java.

Note that enabling audio in a conversation establishes an audio leg for a member of the conversation. The audio is only streamed to other members of the conversation who have also enabled audio.

2 - Showing MemberMedia events

In the previous quickstart we added a RecyclerView to our app and showed the chat history by adding ChatAdapter.java. As a refresher, to observe events that happen in a conversation we tapped into conversation.messageEvent() and added a ResultListener that fires whenever there's a new event. Up until now, the only events we've dealt with are Text events. Now we're going to handle any MemberMedia events that get sent to a conversation.
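
As a rough reminder of that wiring (a hedged sketch, not the exact code from the previous quickstart — addEvent() here is a hypothetical helper on the adapter):

// ChatActivity.java (previous quickstart, roughly)
conversation.messageEvent().add(new ResultListener<Event>() {
    @Override
    public void onSuccess(Event event) {
        // hand each incoming event to the adapter so the RecyclerView can render it
        chatAdapter.addEvent(event);
    }
});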

2.1 - Handling MemberMedia events in ChatAdapter.java

We're going to make some changes to the onBindViewHolder() method. Currently we check for Text events like so: if (events.get(position).getType().equals(EventType.TEXT)).

Now we need to add a check for MemberMedia events with an else if.

// ChatAdapter.java

@Override
public void onBindViewHolder(ChatAdapter.ViewHolder holder, int position) {
    if (events.get(position).getType().equals(EventType.TEXT)) {
        // the current logic for handling Text events
        ...
    } else if (events.get(position).getType().equals(EventType.MEMBER_MEDIA)) {
        final MemberMedia mediaMessage = (MemberMedia) events.get(position);
        holder.text.setText(mediaMessage.getMember().getName() + (mediaMessage.isAudioEnabled() ? " enabled" : " disabled") + " audio.");
        holder.seenIcon.setVisibility(View.INVISIBLE);
    }
}

After we check that the event's type equals EventType.MEMBER_MEDIA, we show a message in the adapter telling us who enabled or disabled audio. Don't forget to set the visibility of the seenIcon! We'll just always set it to invisible in this case.

Try it out!

After this you should be able to run the app on two different Android devices or emulators. Try enabling or disabling audio and speaking to yourself or a friend!

Note: Don't forget to generate new JWTs for your users if it's been over 24 hours since you last generated the user JWTs.

The next guide covers how to easily call users with the convenience method call(). This method offers an easy-to-use alternative for creating a conversation, inviting users and manually enabling their audio streams.

View the source code for this example. 

Getting Started with Nexmo In-App Voice for iOS

In this guide we'll cover adding audio events to the Conversation we have created in the simple conversation with events guide. We'll deal with sending and receiving media events to and from the conversation.

Concepts

This guide will introduce you to the following concepts:

  • Audio Stream - The stream that the SDK gives you on your device to listen to audio and send audio
  • Audio Leg - A server-side API term. Legs are a part of a conversation. When audio is enabled on a conversation, a leg is created

Before you begin

  • Ensure you have Node.js and npm installed (you'll need them for the Nexmo CLI)
  • Ensure you have Xcode installed
  • Create a free Nexmo account - signup 
  • Install the Nexmo CLI:

    $ npm install -g nexmo-cli@beta
    

    Set up the CLI to use your Nexmo API key and API secret. You can get these from the settings page in the Nexmo Dashboard.

    $ nexmo setup api_key api_secret
    

1.0 - Start a new iOS project

Open Xcode and start a new project. We'll name it "AudioQuickStart".

1.1 Adding the Nexmo Stitch iOS SDK to CocoaPods

Navigate to the project's root directory in Terminal and run pod init. Open the file named Podfile and configure its specifications as follows:

platform :ios, '10.0'

use_frameworks!

source "https://github.com/Nexmo/PodSpec.git"
source 'git@github.com:CocoaPods/Specs.git'


target 'AudioQuickStart' do # use your Xcode target's name here

  pod "Nexmo-Stitch" #, :git => "https://github.com/Nexmo/stitch-ios-sdk", :branch => "release" # development

end
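
Once the Podfile is saved, install the pod and work from the generated workspace rather than the project file from now on. Assuming you kept the project name AudioQuickStart, that looks like:

$ pod install
$ open AudioQuickStart.xcworkspace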

1.2 Adding ViewControllers & .storyboard files

Let's add a few view controllers. Start by adding a custom subclass of UIViewController from a CocoaTouch file named LoginViewController, which we will use for the login functionality, and another custom subclass of UIViewController from a CocoaTouch file named ChatViewController, which we will use for the chat functionality. Add two new scenes to Main.storyboard, assigning each to one of these custom subclasses respectively.

1.3 Creating the login layout

Let's lay out the login functionality. Set constraints on the top & leading attributes of an instance of UIButton with a constant HxW of 71x94 to the top of the Bottom Layout Guide + 20 and the leading attribute of view + 16. This is our login button. Reverse leading to trailing for another instance of UIButton with the same constraints. This is our chat button. Set the text on these instances accordingly. Add a status label centered horizontally & vertically. Finally, embed this scene in a navigation controller. Control-drag from the chat button to the scene assigned to the chat view controller, naming the segue chatView.

1.4 - Create the Login Functionality

Below UIKit, let's import the Stitch SDK (import NexmoConversation). Next, we set up an instance of the ConversationClient and save it as a member variable in the view controller.

/// Nexmo Conversation client
let client: ConversationClient = {
    return ConversationClient.instance
}()

We also need to wire up the buttons in LoginViewController.swift. Don't forget to replace USER_JWT with the JWT generated from the Nexmo CLI. For a refresher on how to generate a JWT, check out quickstart one.

    // status label
    @IBOutlet weak var statusLbl: UILabel!

    // login button
    @IBAction func loginBtn(_ sender: Any) {

        print("DEMO - login button pressed.")

        let token = Authenticate.userJWT

        print("DEMO - login called on client.")

        client.login(with: token).subscribe(onSuccess: {

            print("DEMO - login susbscribing with token.")
            self.statusLbl.isEnabled = true
            self.statusLbl.text = "Logged in"

            if let user = self.client.account.user {
                print("DEMO - login successful and here is our \(user)")
            } // insert activity indicator to track subscription

        }, onError: { [weak self] error in
            self?.statusLbl.isHidden = false

            print(error.localizedDescription)

            // remove to a function
            let reason: String = {
                switch error {
                case LoginResult.failed: return "failed"
                case LoginResult.invalidToken: return "invalid token"
                case LoginResult.sessionInvalid: return "session invalid"
                case LoginResult.expiredToken: return "expired token"
                case LoginResult.success: return "success"
                default: return "unknown"
                }
            }()

            print("DEMO - login unsuccessful with \(reason)")

        }).addDisposableTo(client.disposeBag) // keep a reference to this subscription in the client's disposeBag so Rx doesn't release it while the login operation is in progress
    }

    // chat button
    @IBAction func chatBtn(_ sender: Any) {

        let aConversation: String = "aConversation"
        _ = client.conversation.new(aConversation, withJoin: true).subscribe(onError: { error in

            print(error)

            guard self.client.account.user != nil else {

                let alert = UIAlertController(title: "LOGIN", message: "The `.user` property on self.client.account is nil", preferredStyle: .alert)

                let alertAction = UIAlertAction(title: "OK", style: .default, handler: nil)

                alert.addAction(alertAction)

                self.present(alert, animated: true, completion: nil)

                return print("DEMO - chat self.client.account.user is nil");

            }

            print("DEMO - chat creation unsuccessful with \(error.localizedDescription)")

        })

        performSegue(withIdentifier: "chatView", sender: nil)
    }

1.5 Stubbed Out Login

Next, let's stub out the login workflow.

Create an Authenticate struct with a static member userJWT. For now, stub it out to always return the value of USER_JWT.

// a stub for holding the value of the user JWT
struct Authenticate {

    static let userJWT = ""

}

After the user logs in, they'll press the "Chat" button which will take them to the ChatViewController and let them begin chatting in the conversation we've already created.

1.6 Navigate to ChatViewController

As we mentioned above, creating a conversation results from a call to the new() method. In the absence of a server we’ll 'simulate' the creation of a conversation within the app when the user clicks the chatBtn.

When we construct the segue for ChatViewController, we pass along the first conversation so that the new controller has a reference to it. Remember that the CONVERSATION_ID comes from the id generated in the first quickstart.

// prepare(for segue:)
override func prepare(for segue: UIStoryboardSegue, sender: Any?) {

    // setting up a segue
    let chatVC = segue.destination as? ChatController

    // passing a reference to the conversation
    chatVC?.conversation = client.conversation.conversations.first

}

1.7 Create the Chat layout

Let's create the chat layout. Add an instance of UITextView, UITextField, & UIButton. Set the constraints on the UITextView: .trailing = trailingMargin, .leading = Text Field.leading, .top = Top Layout Guide.bottom, .bottom + 15 = Text Field.top. Set the leading attribute on the Text Field to = leadingMargin and its .bottom attribute + 20 to the Bottom Layout Guide's top attribute. Set the Button's .trailing to trailingMargin + 12 and its .bottom attribute + 20 to the Bottom Layout Guide's .top attribute.

1.8 Create the ChatViewController

Like last time, we'll wire up the views in ChatViewController.swift. We also need to grab the reference to the conversation passed in from LoginViewController.


import UIKit
import NexmoConversation

class ChatController: UIViewController {

    // conversation for passing client
    var conversation: Conversation?

    // textView for displaying chat
    @IBOutlet weak var textView: UITextView!

    // textField for capturing text
    @IBOutlet weak var textField: UITextField!

}

1.9 - Sending and receiving text events

To send a message we simply need to call send() on our instance of conversation. send() takes one argument, a String message.

// sendBtn for sending text
@IBAction func sendBtn(_ sender: Any) {

    do {
        // send method
        try conversation?.send(textField.text!)

    } catch let error {
        print(error)
    }

}

In viewDidLoad() we want to add a handler for handling new events like the TextEvents we create when we press the send button. We can do this like so:

conversation?.events.newEventReceived.subscribe(onSuccess: { event in
   guard let event = event as? TextEvent, event.isCurrentlyBeingSent == false else { return }
   guard let text = event.text else { return }
   self.textView.insertText(" \(text) \n ")
})

2.0 - Building Audio

Since we will be tapping into protected device functionality, we have to ask for permission. We will update our .plist as well as display an alert. After permissions, we will add the AVFoundation framework, set up audio from within the SDK, and add a speaker emoji for our UI 🔈

2.1 Xcode Permission

Open up the raw (source code) version of Info.plist and drop the following lines in there.

<key>NSMicrophoneUsageDescription</key>
    <string>audio call permission</string>

2.2 User Permission

Add the AVFoundation library:

import AVFoundation

Create a setupAudio() function:

private func setupAudio() {
    do {
        let session = AVAudioSession.sharedInstance()

        try session.setCategory(AVAudioSessionCategoryPlayAndRecord)
        session.requestRecordPermission { _ in }
    } catch  {
        print(error)
    }
}
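
The guide doesn't prescribe where to call this from; a reasonable spot (an assumption on our part, not a requirement of the SDK) is viewDidLoad() in ChatController, so the microphone permission prompt appears before the user tries to enable audio:

// ChatController.swift
override func viewDidLoad() {
    super.viewDidLoad()

    // the newEventReceived subscription from section 1.9 also lives here

    setupAudio()
}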

2.3 Enable / Disable

To add functionality for enable / disable, we simply create functions that call the .enable() or .disable() methods on the media property of our conversation instance, as shown below in sections 2.3.1 and 2.3.2.

Note that enabling audio in a conversation establishes an audio leg for a member of the conversation. The audio is only streamed to other members of the conversation who have also enabled audio.

2.3.1 Enable

Create a function for enable.

private func enable() {
    do {
        try self.conversation?.media.enable()
    } catch let error {
        print("failed: \(error)")
    }
}

2.3.2 Disable

Create a function for disable.

@IBAction internal func disable() {
    conversation?.media.disable()

    self.navigationController?.popViewController(animated: true)
}

2.4 Speaker Emoji for UI

Let's use a speaker emoji for our UI. Drag and drop a UIButton onto the left-hand side of the UITextField. Control-drag an action onto ChatViewController.swift and name the function like so:


  @IBAction func phoneButtonPressed(_ sender: UIButton) {

    do {
        try conversation?.media.enable()
        sender.titleLabel?.text = "🔇"
    } catch {
        conversation?.media.disable()
        sender.titleLabel?.text = "🔈"
    }

  }

Configure the text property on the button's text label to display either speaker 🔈 for enabled audio or else mute 🔇 for disabled audio.
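
Setting titleLabel?.text directly can be overwritten by the button's own state handling, so if the emoji doesn't update, a slightly more robust variant of the same handler (just a sketch, not something the guide requires) uses setTitle(_:for:):

  @IBAction func phoneButtonPressed(_ sender: UIButton) {

    do {
        try conversation?.media.enable()
        sender.setTitle("🔇", for: .normal) // audio enabled, offer mute
    } catch {
        conversation?.media.disable()
        sender.setTitle("🔈", for: .normal) // audio disabled, offer speaker
    }

  }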

2.5 Console logs

With our enable / disable functions in place, we will see the updates right there in Xcode's console log.

Try it out!

After this you should be able to run the app and enable / disable audio. Try it out!

The next guide covers how to easily call users with the convenience method call(). This method offers an easy-to-use alternative for creating a conversation, inviting users and manually enabling their audio streams.