Transcripts (also referred to as ASR or automatic speech recognition) and Closed Captions are both text versions of the speech from video or audio media.

Closed Captions use the same ASR file as the transcript. With closed captions, the text of what is being spoken is shown on the screen, within the video, in small chunks that correspond to the current location of the video.
Transcripts also show the text of what is being spoken, divided into small chunks that correspond to locations in the video.
The difference is that a transcript displays the entire transcription at once, with a highlight on the entry that corresponds to the current location. Because the entire transcript is visible, it is also searchable within the player, and users can jump to a particular location in the transcript, which automatically moves the video to the same location.
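As a rough illustration of the behavior described above, and not the player's actual implementation, the sketch below models a transcript as a list of timed entries. The `Cue` class, its field names, and the helper functions are all hypothetical, as are the sample entries. It shows how a player might find the entry to highlight for the current playback position, and how a search could return the locations a user can jump to.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Cue:
    """One transcript entry (hypothetical structure, times in seconds)."""
    start: float
    end: float
    text: str


def active_cue_index(cues, position):
    """Return the index of the entry covering `position`, or None.

    Assumes `cues` is sorted by start time and entries do not overlap.
    """
    starts = [c.start for c in cues]
    i = bisect_right(starts, position) - 1
    if i >= 0 and position < cues[i].end:
        return i
    return None


def search(cues, query):
    """Return the start times of entries whose text contains `query`."""
    q = query.lower()
    return [c.start for c in cues if q in c.text.lower()]


# Made-up sample data for illustration only.
cues = [
    Cue(0.0, 2.5, "Welcome to the course."),
    Cue(2.5, 6.0, "Today we cover captions."),
    Cue(6.0, 9.0, "Captions and transcripts share text."),
]

highlighted = active_cue_index(cues, 3.0)   # entry to highlight at 3.0 s
jump_targets = search(cues, "captions")     # positions to seek the video to
```

In this sketch a closed caption banner would display only `cues[highlighted].text`, while a transcript panel would list every entry and highlight the one at index `highlighted`.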
The image on the right shows the transcript button and transcript panel for a video, as well as the closed caption button and caption banner. Both show the same text for the current location in the video.
The text is often the same, but can differ because closed captions are typically (and by design) more accurate. As noted later on this page, however, transcripts can be applied as closed captions to media, in which case the text in both would be identical.