Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

예제로 배우는 러스트 (Rust by Example) 한국어판

러스트는 안전성과 속도 그리고, 병렬 처리에 초점을 맞춘 최신 시스템 프로그래밍 언어 입니다. 러스트는 이를 위해 가비지 컬렉션 기술을 사용하지 않고 메모리 안전성을 지원합니다.

이 문서는 실행 가능한 예제들로 러스트의 여러가지 개념과 표준 라이브러리를 소개합니다. 예제들을 사용하려면 로컬에 러스트를 설치하고 공식 문서도 읽어보기 바랍니다. 관심있는 분은 이 문서의 소스도 보아주세요.

(역주: 보고 계신 한글판의 번역은 여기에서 진행하고 있습니다.)

이제 시작할까요!

  • 인사하기 - 전통의 Hello World 부터 만들어봅시다.

  • 기본 자료형 - 부호있는 정수형과 부호없는 정수형, 기타 기본 자료형들에 대해 배웁시다.

  • 사용자 정의 자료형 - structenum.

  • 변수 바인딩 - 수정 가능한 변수와 스코프, 섀도잉.

  • 자료형 - 타입을 변경하고 정의하는 방법에 대해 배워 봅니다.

  • 형변환 - String, integer, float와 같은 서로 다른 타입으로 변환해 봅니다.

  • 표현식 - 표현식과 이를 이용하는 방법에 대해 배워 봅니다.

  • 제어문 - if/else, for, 그리고 여러 다른 것들.

  • 함수 - 메서드, 클로저와 상위 순서의 함수를 알아봅시다.

  • 모듈 - 모듈을 통해 코드를 정리해봅시다.

  • 크레이트(Crate) - 크레이트는 러스트의 컴파일 단위입니다. 라이브러리도 만들어봅시다.

  • 카고(Cargo) - 러스트의 공식 패키지 관리 툴의 기본 기능을 알아봅시다.

  • 속성 - 속성은 모듈, 크레이트 또는 아이템에 적용하는 메타데이터입니다.

  • 제네릭 - 여러 타입의 매개변수를 가지는 함수나 데이터 타입을 작성하는 법을 알아봅시다.

  • 스코프 규칙 - 스코프는 소유권, 빌리기와 수명 주기에 중요한 역할을 합니다.

  • 트레잇 - 트레잇은 알 수 없는 타입 Self를 위해 선언된 메서드 모음입니다.

  • 매크로 - 매크로는 다른 코드를 작성하는 코드로, 메타프로그래밍이라고 부르기도 합니다.

  • 오류 다루기 - 러스트에서 실패를 다루는 법을 배워봅니다.

  • std 라이브러리 타입 - std 라이브러리에서 제공하는 몇몇 커스텀 타입을 알아봅시다.

  • std 라이브러리 잡동사니 - 파일 핸들링과 스레드에 대한 더 많은 커스텀 타입입니다.

  • 테스트하기 - 러스트에서의 모든 방식의 테스트.

  • Unsafe한 작업 - "안전하지 않은" 동작 블록에 진입하는 방법을 알아봅니다.

  • 호환성 - 러스트의 발전과 가능한 호환성 문제를 다룹니다.

  • 메타데이터 - 문서화와 벤치마크.

인사하기

다음은 전통의 Hello World 프로그램 소스입니다.

// 이 부분은 코멘트라서 컴파일러는 관여하지 않습니다.
// 오른쪽에 "Run" 버튼을 클릭해서 테스트할 수 있습니다.
// 키보드 사용하는 쪽이 편하시면 "Ctrl + Enter" 로 실행 할 수 있습니다.

// 이 코드는 수정하실 수 있습니다. 마음껏 고쳐보세요!
// 원래 코드로 되돌리려면 오른쪽의 "Undo" 버튼을 클릭하세요.

// 다음은 main 함수입니다.
fn main() {
    // 컴파일하고 실행하면, 이곳의 문장들이 실행됩니다.

    // 콘솔에 문자열을 출력합니다.
    println!("Hello World!");
}

println! 은 문자열을 콘솔에 출력하는 macro 입니다.

실행 파일은 러스트 컴파일러 rustc로 만들 수 있습니다.

$ rustc hello.rs

rustc가 실행 파일 hello를 만들어 줄 겁니다.

$ ./hello
Hello World!

실습

페이지 위쪽의 프로그램 상자에서 "Run" 을 클릭하면 어떤 내용이 출력되는지 확인합니다. 그리고, 한 줄을 추가하고 println! 매크로를 한 번 더 사용해서 아래 문자열이 출력되도록 해보세요.

Hello World!
I'm a Rustacean!

코멘트

모든 프로그램에는 코멘트가 필요합니다. 러스트는 이를 위해 몇가지 문법을 제공합니다.

  • 일반 코멘트를 사용하면 컴파일러가 안쪽의 내용을 무시해줍니다. :
    • // 해당 줄의 끝까지 코멘트가 됩니다.
    • /* 둘러싼 부분이 코멘트가 됩니다. */
  • 문서 코멘트라이브러리 문서의 생성에 사용됩니다. :
    • /// 이 줄 다음에 오는 항목의 문서를 생성합니다.
    • //! 이 줄을 포함한 항목의 문서를 생성합니다.
fn main() {
    // 이것은 한 줄 코멘트입니다.
    // 슬래시 두 개를 맨 앞에 넣으면 됩니다.
    // 컴파일러는 코멘트 안에 있는 것은 아무것도 읽지 않습니다.

    // println!("Hello, world!");

    // 실행해보세요. 뭐가 보이나요? 윗줄에서 슬래시 두개를 지우고 다시 실행해보면요?

    /* 
     * 이것은 블럭 코멘트입니다. 일반적으로는 한 줄 코멘트를 많이 사용합니다. 하지만
     * 임시로 코드를 막을 때에는 블럭 코멘트가 아주 편리합니다.
     * /* 블럭 코멘트는 /* 중첩 */ 될 수 있습니다. */
     * 이 main 함수 안의 내용들을 모두 감싸는 것도 몇번의 타이핑이면 충분합니다.
     * /*/*/* 직접 해보세요! */*/*/
     */

    /*
    주의: 위의 블럭 코멘트 앞쪽에 `*` 컬럼은 보기 좋으라고 넣은 것입니다. 
    코멘트앞에 반드시 '*' 를 넣어야 하는 것은 아닙니다.
    */

    // 표현식을 다룰 때는 블럭 코멘트가 한 줄 코멘트보다 편리합니다.
    // 다음에서 코멘트를 삭제하고 결과가 어떻게 바뀌는지 확인해보세요.
    let x = 5 + /* 90 + */ 5;
    println!("x 는 10 인가 아니면 100 인가? x 는 {} 이다.", x);
}

참고:

라이브러리 문서

형식을 지정하는 출력

러스트에서 출력 관련 기능은 std::fmt에 정의된 몇 개의 macro로 처리합니다.

  • format!: 형식 지정 문자열을 String 에 출력합니다.
  • print!: format! 과 동일하지만, 출력을 콘솔 (io::stdout) 에 합니다.
  • println!: print! 과 동일하지만, 개행 문자를 덧붙여줍니다.
  • eprint!: format! 과 동일하지만, 표준 오류 스트림 (io::stderr) 에 출력합니다.
  • eprintln!: eprint!과 동일하지만, 개행 문자를 덧붙여줍니다.

이들은 모두 동일한 형식 지정자를 사용합니다. 러스트는 컴파일 시점에 형식 지정이 올바른지 여부도 검사합니다.

fn main() {
    // 일반적으로, {}는 자료형에 관계없이 주어진 인자로 대치됩니다.
    // 이때, 인자들은 문자열로 변환됩니다.
    println!("{} 번째 날", 31);

    // 자료형 표시(suffix)가 없으므로 31은 i32 형이 됩니다. 31 의 자료형을 바꾸려면
    // 자료형 표시(suffix)를 해주면 됩니다. 31i64 라고 하면 i64 형이 됩니다.

    // 중괄호를 이용해서 다양한 방법으로 형식을 지정할 수 있는데, 다음처럼
    // 인자의 위치를 이용할 수 있습니다.
    println!("{0}야 얘는 내 친구 {1}야. {1}야 이쪽은 {0}야", "영희", "철수");

    // 이름을 이용할 수도 있습니다.
    println!("{subject} {verb} {object}",
             object="the lazy dog",
             subject="the quick brown fox",
             verb="jumps over");

    // 역주) 한글도 잘 됩니다.
    println!("{주어} {목적어} {동사}.", 동사="한다", 주어="내가", 목적어="코딩을");

    // `:` 의 뒤에 특별한 형식을 지정할 수도 있습니다.
    println!("{:b}명의 사람들 중 {}명 만이 이진법을 알고, 나머지 반은 모른다", 2, 1);
    println!("10진수:                {}",   69420); // 69420
    println!(" 2진수 (binary):       {:b}", 69420); // 10000111100101100
    println!(" 8진수 (octal):        {:o}", 69420); // 207454
    println!("16진수 (hexadecimal):  {:x}", 69420); // 10f2c

    // 폭을 지정해서 오른편 정렬을 할 수도 있습니다. 다음 코드의 출력은
    // "     1" 이 됩니다. 즉, 공백문자 5개가 나온 후에 "1"이 출력됩니다.
    println!("{number:>width$}", number=1, width=6);
    //println!("{number:>5}", number=1);

    // 공백대신 숫자 0을 넣을 수도 있습니다. 다음 코드는 "000001"을 출력합니다.
    println!("{number:0>width$}", number=1, width=6);
    //println!("{number:0>5}", number=1); // 00001
    // 반대쪽에 넣을 수도 있습니다. 다음 코드는 "100000"을 출력합니다.
    println!("{number:0<width$}", number=1, width=6);
    //println!("{number:0<5}", number=1); // 10000

    // 역주) 역시 한글도 잘 됩니다.
    println!("{숫자:>폭$}", 숫자=1, 폭=6);
    println!("{숫자:>0폭$}", 숫자=1, 폭=6);

    // 러스트는 인자의 숫자가 맞는지 검사도 해줍니다.
    println!("내 이름은 {0}요. {1} {0}.", "본드");
    // FIXME ^ 위 코드에서 빠진 인자 "제임스"를 추가해주세요.

    // fmt::Display를 정의한 타입만 `{}`를 이용해 출력할 수 있습니다.
    // 사용자 정의 타입은 기본적으로 정의하고 있지 않고요.

    // 한개의 `i32` 를 가지고 있는 `Structure`라는 구조체를 생성해봅니다.
    #[allow(dead_code)]
    struct Structure(i32);

    // `Structure`가 fmt::Display를 가지고 있지 않기 때문에, 이 코드는 컴파일조차
    // 되지 않습니다.
    println!("이 구조체 {}는 출력되지 않을겁니다.", Structure(3));
    // FIXME ^ 위 코드를 코멘트로 막아주세요.

    // 러스트 1.58부터, 근처의 변수로부터 매개변수를 가져올 수 있습니다.
    // 여기에서는 위에서처럼 "    1", 1 앞에 4개의 공백을 출력합니다.
    let number: f64 = 1.0;
    let width: usize = 5;
    println!("{number:>width$}");

    // 역주) 역시, 한글도 잘 됩니다. 그리고 사실 변수명도 한글로 잘 됩니다.
    let 숫자: f64 = 1.0;
    let 폭: usize = 6;
    println!("{숫자:->폭$}"); // -----1
}

std::fmt 에는 문자열 출력에 관련된 많은 트레잇(traits) 이 있습니다. 다음은 두개의 중요한 트레잇입니다.

  • fmt::Debug: {:?} 에 사용됩니다. 디버깅에 사용합니다.
  • fmt::Display: {} 를 사용됩니다. 보기 편하게 출력 형식을 지정하는데 사용합니다.

위의 예제에서는 표준 라이브러리가 지원하는 자료형들을 출력했기 때문에 fmt::Display를 사용했습니다. 사용자 정의 자료형을 위해서는 추가 작업이 필요합니다.

만약 fmt::Display 트레잇을 구현해주면 자동으로 ToString 트레잇이 구현되고, 해당 자료형을 String으로 변환(convert) 할 수 있게됩니다.

중간의 #[allow(dead_code)]는 바로 다음에 오는 모듈에만 적용되는 속성입니다.

Activities

  • 위 코드에서 두 개의 이슈(FIXME 라고 된 부분들)를 수정하고 오류없이 실행되게 해보세요.
  • println! 매크로의 소수점 표시 기능을 이용해서 원주율의 근사치는 3.142이다. 를 출력해보세요. 이를 위한 파이값은 let pi = 3.141592 라고 정의해주세요. (힌트: std::fmt 문서에서 소수점 표시(Precision) 항목을 참고하세요.)

참고:

std::fmt, 매크로(macros), 구조체(struct), 트레잇(traits), 미사용 코드(dead_code)

디버그

std::fmt의 형식 지정자를 사용하려면 출력할 자료형마다 해당 기능을 구현해야 합니다. 표준(std) 라이브러리의 자료형들에 대해서는 이미 구현이 되어 있지만, 다른 자료형의 경우에는 반드시 직접 구현해야만 합니다.

이때 fmt::Debug 를 이용하면 어렵지 않게 만들 수 있습니다. 어떤 자료형이든 fmt::Debug 을 이용해 파생 구현(derive)을 할 수 있습니다.

#![allow(unused)]
fn main() {
// 아래 구조체는 fmt::Display 나 fmt::Debug 중 어느것으로도 출력할 수 없습니다.
struct UnPrintable(i32);

// derive 속성을 사용하면 fmt::Debug 으로 구조체를 출력할 수 있는 코드가 구현됩니다.
#[derive(Debug)]
struct DebugPrintable(i32);
}

또한 모든 표준(std) 라이브러리의 자료형들은 {:?} 으로 출력이 가능합니다.

// Structure 구조체를 위해 fmt::Debug를 파생 구현합니다. 
// 이 구조체는 i32 한개만 가지고 있습니다.
#[derive(Debug)]
struct Structure(i32);

// 구조체 Deep의 내부에 Structure를 넣습니다. 역시 출력가능하게 만듭니다.
#[derive(Debug)]
struct Deep(Structure);

fn main() {
    // {:?}는  {}와 비슷하게 동작합니다.
    println!("1년에는 {:?}개월이 있다.", 12);
    println!("{1:?} {0:?}는 {actor:?} 이름이다.",
             "강호",
             "송",
             actor="배우의");

    // Structure 도 출력할 수 있습니다!
    println!("이번에는 {:?} 를 출력하자!", Structure(3));
    
    // 파생 구현(derive)의 단점은 출력 방식을 제어할 수 없다는 점입니다.
    // 다음 코드에서 7 만 출력되게 하려면 어찌해야 할까요.
    println!("이번에는 {:?} 를 출력하자!", Deep(Structure(7)));
}

fmt::Debug 는 출력 기능은 제공하지만, 우아함은 포기해야합니다. 러스트에는 {:#?}를 이용한 "예쁘게 출력하기" 기능도 있습니다.

#[derive(Debug)]
struct Person<'a> {
    name: &'a str,
    age: u8
}

fn main() {
    let name = "Peter";
    let age = 27;
    let peter = Person { name, age };

    // 예쁘게 출력하기
    println!("{:#?}", peter);
}

출력 형식을 바꾸려면 fmt::Display 를 직접 구현해야만 합니다.

참고:

속성(attributes), 파생 구현(derive), std::fmt, 구조체(struct)

Display

fmt::Debug의 출력은 별로 깔끔하지 않기 때문에, 출력 형태를 별도로 구현해야 하는 경우가 많습니다. 이때는 fmt::Display 트레잇을 구현하면 {} 마커로 출력할 수 있습니다. 구현하는 방법은 다음과 같습니다.

#![allow(unused)]
fn main() {
// fmt 모듈을 사용하기 위해 use 키워드로 임포트합니다.
use std::fmt;

// fmt::Display 를 구현할 구조체입니다. Structure 라는 구조체에 한 개의 i32만 넣었습니다.
struct Structure(i32);

// {} 를 이용해 출력하려면 해당 자료영에 대해 fmt::Display 트레잇이 구현되어 있어야 합니다.
impl fmt::Display for Structure {
    // 이 트레잇에는 정확한 형식의 fmt 함수가 있어야 합니다.
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // 첫번째 요소를 출력 버퍼 f 에 씁니다.
        // 리턴값은 fmt::Result 인데, 동작의 성공 여부를 나타냅니다. 
        // write! 는 println! 과 유사한 문법을 사용합니다.
        write!(f, "{}", self.0)
    }
}
}

fmt::Display 를 쓰는 편이 fmt::Debug 의 경우 보다 더 깔끔합니다만, 표준(std) 라이브러리에 구현하기에는 어려움이 있습니다. 출력 형식을 지정하기 애매한 자료형 때문입니다. 예를 들어, 표준 라이브러리에서 모든 Vec<T> 에 대해 출력방식을 구현한다면 어떤 식으로 해야 할까요? 다음 두가지 중에 어느쪽이 적절할까요?

  • Vec<path>: /:/etc:/home/username:/bin (: 로 나눠서 표시하기)
  • Vec<number>: 1,2,3 (, 로 나눠서 표시하기)

둘 다 안됩니다. 모든 자료형을 위한 이상적인 한가지 출력형식이란 있을 수 없고, 표준라이브러리가 어떤 한가지 방식을 강제해서도 안되기 때문입니다. fmt::DisplayVec<T> 나 다른 제네릭 컨테이너에 대해서는 구현되어있지 않습니다. 이런 경우에는 fmt::Debug를 사용하면 됩니다.

이것이 큰 문제는 되지 않습니다. 제네릭이 아닌 모든 새로운 컨테이너 자료형은 fmt::Display를 구현하면 되기 때문입니다.

use std::fmt; // fmt를 임포트합니다.

// 숫자 두개를 가진 구조체입니다. Debug 파생 구현도 만들어서, Display 구현과
// 비교해봅시다.
#[derive(Debug)]
struct MinMax(i64, i64);

// MinMax 를 위한 Display 구현입니다.
impl fmt::Display for MinMax {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // self.number로 해당 위치의 데이터를 가리킵니다.
        write!(f, "({}, {})", self.0, self.1)
    }
}

// 비교를 위해 이름이 있는 필드들로 구조체를 만듭시다.
#[derive(Debug)]
struct Point2D {
    x: f64,
    y: f64,
}

// 역시 Point2D 의 Display 도 구현합니다.
impl fmt::Display for Point2D {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // x와 y만 표시하도록 합니다.
        write!(f, "x: {}, y: {}", self.x, self.y)
    }
}

fn main() {
    let minmax = MinMax(0, 14);

    println!("구조체의 출력을 비교해봅시다 :");
    println!("Display: {}", minmax);
    println!("Debug: {:?}", minmax);

    let big_range =   MinMax(-300, 300);
    let small_range = MinMax(-3, 3);

    println!("넓은 범위는 {big}이고, 좁은 범위는 {small}입니다.",
             small = small_range,
             big = big_range);

    let point = Point2D { x: 3.3, y: 7.2 };

    println!("포인트 구조체도 비교합시다 :");
    println!("Display: {}", point);
    println!("Debug: {:?}", point);

    // 다음 코드에서는 컴파일 오류가 납니다. Debug와 Display는 구현되었지만,
    // {:b}는 fmt::Binary 구현이 필요하기 때문입니다. 
    // println!("Point2D 를 이진수로 출력하면 어떻게 될까: {:b}?", point);
}

fmt::Display 는 구현했지만 fmt::Binary 는 안했기 때문에 {:b} 는 사용할 수 없습니다. std::fmt 에는 많은 트레잇(traits) 이 있고 각각을 구현해주어야 합니다. 더 자세한 사항은 std::fmt 을 보아주세요.

실습

위의 출력을 확인하시고, Point2D 구조체를 참고해서 복소수(Complex 라고 명명하세요) 구조체를 만들어서 출력하면 다음처럼 나오도록 구현해보세요.

Display: 3.3 + 7.2i
Debug: Complex { real: 3.3, imag: 7.2 }

참고:

파생 구현(derive), std::fmt, 매크로(macros), 구조체(struct), 트레잇(trait), use

테스트 케이스: List

내부 요소들이 순서대로 처리되어야 하는 구조체의 경우 fmt::Display를 구현하기가 어렵습니다. 각각의 write! 들이 fmt::Result 를 리턴하는 것이 문제가 됩니다. 적절한 처리를 위해서는 각각의 리턴값을 잘 처리해야 하는데요. 러스트에는 정확히 이때 사용할 수 있는 ? 연산자가 있습니다.

write! 에서 ? 를 사용하는 방법은 다음과 같습니다.

// write! 를 실행하고 오류가 있는지 검사합니다. 만약 오류가 발생하면
// 오류를 리턴하고 아니면 계속 진행합니다.
write!(f, "{}", value)?;

연산자 ? 를 사용하면, Vecfmt::Display 구현도 쉽게 할 수 있습니다.

use std::fmt; // fmt 모듈 임포트합니다.

// Vec 을 담고있는 List 구조체를 정의합니다.
struct List(Vec<i32>);

impl fmt::Display for List {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // 튜플(tuple) 인덱싱으로 값을 가져오고, 참조를 vec 에 저장합니다.
        let vec = &self.0;

        write!(f, "[")?;

        // vec 의 요소들을 하나씩 v 로 참조하면서 순환합니다.
        // count 에는 반복 횟수가 들어갑니다.
        for (count, v) in vec.iter().enumerate() {
            // 첫 번째 요소만 빼고 쉼표를 추가합니다.
            // 오류가 발생시 리턴하기 위해 ? 연산자를 사용합니다.
            if count != 0 { write!(f, ", ")?; }
            write!(f, "{}", v)?;
        }

        // 대괄호를 닫고 fmt::Result 를 리턴합니다.
        write!(f, "]")
    }
}

fn main() {
    let v = List(vec![1, 2, 3]);
    println!("{}", v);
}

실습

코드를 수정해서 각각의 인덱스 번호도 출력되게 해보세요. 다음처럼 출력되면 성공입니다.

[0: 1, 1: 2, 2: 3]

참고

for, ref, Result, struct, ?vec!

출력 형식

출력 형식을 지정할 때 형식 지정자를 사용하는 것을 앞에서 보았습니다.

  • format!("{}", foo) -> "3735928559"
  • format!("0x{:X}", foo) -> "0xDEADBEEF"
  • format!("0o{:o}", foo) -> "0o33653337357"

동일한 변수(foo)도 X, o 등 지정된 형식에 따라 다르게 출력됩니다.

형식을 지정하는 기능은 트레잇을 통해서 구현되고, 인자의 자료형마다 트레잇이 하나씨 있습니다. 가장 자주 쓰이는 트레잇은 Display 이고, 형식을 지정하지 않는 경우 {} 를 담당합니다.

use std::fmt::{self, Formatter, Display};

struct City {
    name: &'static str,
    // 위도(Latitude)
    lat: f32,
    // 경도(Longitude)
    lon: f32,
}

impl Display for City {
    // 여기서 f 는 버퍼이며, 이 함수에서는 형식이 처리된 문자열을 버퍼에 출력해야 합니다.
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        let lat_c = if self.lat >= 0.0 { 'N' } else { 'S' };
        let lon_c = if self.lon >= 0.0 { 'E' } else { 'W' };

        // write! 는 format! 과 비슷하지만 문자열을 버퍼(첫번째 인자)에 출력합니다.
        write!(f, "{}: {:.3}°{} {:.3}°{}",
               self.name, self.lat.abs(), lat_c, self.lon.abs(), lon_c)
    }
}

#[derive(Debug)]
struct Color {
    red: u8,
    green: u8,
    blue: u8,
}

fn main() {
    for city in [
        City { name: "Dublin", lat: 53.347778, lon: -6.259722 },
        City { name: "Oslo", lat: 59.95, lon: 10.75 },
        City { name: "Vancouver", lat: 49.25, lon: -123.1 },
    ].iter() {
        println!("{}", *city);
    }
    for color in [
        Color { red: 128, green: 255, blue: 90 },
        Color { red: 0, green: 3, blue: 254 },
        Color { red: 0, green: 0, blue: 0 },
    ].iter() {
        // Color 에 대해 fmt::Display 를 구현한 다음 아래의 {:?}를 {} 로 변경해보세요.
        println!("{:?}", *color);
    }
}

std::fmt 문서에 형식지정 트레잇 전체 목록과 인자들이 있습니다.

실습

Color 구조체의 fmt::Display 트레잇을 구현해서 다음처럼 출력되게 해보세요.

RGB (128, 255, 90) 0x80FF5A
RGB (0, 3, 254) 0x0003FE
RGB (0, 0, 0) 0x000000

다음을 참고하시면 구현할 수 있습니다. :

참고:

std::fmt

기본 자료형

러스트는 다양한 종류의 기본 자료형을 제공합니다. 여기서는 그중 몇가지를 소개합니다.

단순 자료형

  • 부호있는 정수: i8, i16, i32, i64, i128, isize (포인터 사이즈)
  • 부호없는 정수: u8, u16, u32, u64, u128, usize (포인터 사이즈)
  • 실수: f32, f64
  • char 유니코드 자료형 'a', 'α', '∞' (각각 4바이트)
  • bool 자료형: true 또는 false
  • 그리고 비어있는 튜플() 만을 값으로 가지는 유닛 자료형 ().

유닛 자료형은 튜플이지만, 여러 개의 값을 가지지는 않아서 복합 자료형이 아닙니다.

복합 자료형

  • [1, 2, 3] 과 같은 배열
  • (1, true) 과 같은 튜플

모든 변수는 자료형을 지정할 수 있습니다. 숫자들은 후위표시(suffix) 로 자료형을 표시합니다. 정수는 기본적으로 i32 이고 실수는 f64 입니다. 러스트는 문맥으로부터 자료형을 추론 할 수 있습니다.

fn main() {
    // 변수에는 자료형을 지정할 수 있습니다.
    let logical: bool = true;

    let a_float: f64 = 1.0;  // 보통의 자료형 지정
    let an_integer   = 5i32; // 후위 표시

    // 또는 디폴트 자료형을 사용할 수도 있습니다.
    let default_float   = 3.0; // f64
    let default_integer = 7;   // i32
    
    // 문맥으로부터 자료형을 추론할 수도 있습니다.
    let mut inferred_type = 12; // 다른 행으로부터 i64 자료형으로 추론되었습니다.
    inferred_type = 4294967296i64;
    
    // 가변(mutable) 변수만 값을 변경할 수 있습니다.
    let mut mutable = 12; // 가변형 i32
    mutable = 21;
    
    // 에러! 변수의 자료형은 바뀔 수 없습니다.
    mutable = true;
    
    // 변수는 덮어 쓰여질 수 있습니다. 이것을 셰도우잉(shadowing)이라고 합니다.
    let mutable = true;

    /* 복합 자료형 - 배열과 튜플 */
    
    // 배열의 시그니처는 타입 T와 길이 length를 이용해 [T; length]로 정의됩니다.
    let my_array: [i32; 5] = [1, 2, 3, 4, 5];

    // 튜플은 서로 다른 타입을 갖는 값의 모음이며,
    // () 안에 들어 있습니다.
    let my_tuple = (5u32, 1u8, true, -5.04f32); // 타입: (u32, u8, bool, f32)
}

참고

표준(std) 라이브러리, mut, inference, shadowing

변수와 연산자

변수를 정의할 때는, 정수의 경우는 1, 실수는 1.2, 문자는'a', 문자열은 "abc", 불리언은 true 그리고 유닛은 () 와 같은 문법을 사용합니다.

정수는 16진수나 8진수 또는 2진수로 표시할 수 있으며, 각각의 변수 앞에 다음을 붙여서 표시합니다: 0x, 0o or 0b.

숫자 자료형은 가독성을 높이기 위해 밑줄을 넣을 수 있습니다. 예를 들면, 1_0001000과 같고, 0.000_0010.000001과 같습니다.

E-노테이션 방식으로 값을 지정할 수도 있습니다. ie6, 7.6e-4처럼요. 이들은 f64 타입이 됩니다.

변수를 정의할 때는 컴파일러에게 자료형을 알려주어야 합니다. 여기서는 u32 로 부호없는 32비트 정수형임을 알리고, i32로 부호있는 32비트 정수임을 표시했습니다.

사용가능한 연산자와 우선 순위C계열 언어와 비슷합니다.

fn main() {
    // 정수 덧셈
    println!("1 + 2 = {}", 1u32 + 2);

    // 정수 뺄셈
    println!("1 - 2 = {}", 1i32 - 2);
    // TODO ^ 1i32 를 1u32 로 바꾸고 어째서 자료형이 중요한지 확인해보세요.

    // 과학적 노테이션
    println!("1e4는 {}, -2.5e-3은 {}입니다", 1e4, -2.5e-3);

    // 불리언 로직
    println!("true AND false is {}", true && false);
    println!("true OR false is {}", true || false);
    println!("NOT true is {}", !true);

    // 비트연산
    println!("0011 AND 0101 is {:04b}", 0b0011u32 & 0b0101);
    println!("0011 OR 0101 is {:04b}", 0b0011u32 | 0b0101);
    println!("0011 XOR 0101 is {:04b}", 0b0011u32 ^ 0b0101);
    println!("1 << 5 is {}", 1u32 << 5);
    println!("0x80 >> 2 is 0x{:x}", 0x80u32 >> 2);

    // 밑줄을 이용해 보기 좋게!
    println!("One million is written as {}", 1_000_000u32);
}

튜플

튜플은 자료형이 다른 값들의 모음입니다. 괄호()로 생성하는데, 나열된 자료형 (T1, T2, ...)에 해당하는 값들을 넣을 수 있습니다. 여러 개의 값을 가지기 때문에 함수에서 한개 이상의 값을 리턴할 때 사용할 수 있습니다.

// 튜플은 함수의 인자나 리턴값에 사용될 수 있습니다.
fn reverse(pair: (i32, bool)) -> (bool, i32) {
    // let 으로 튜플의 요소들을 각각의 변수에 바인딩 할 수 있습니다.
    let (integer, boolean) = pair;

    (boolean, integer)
}

// 실습에 사용할 구조체입니다.
#[derive(Debug)]
struct Matrix(f32, f32, f32, f32);

fn main() {
    // 각기 다른 자료형들로 만들어진 튜플
    let long_tuple = (1u8, 2u16, 3u32, 4u64,
                      -1i8, -2i16, -3i32, -4i64,
                      0.1f32, 0.2f64,
                      'a', true);

    // 인덱스 번호로 튜플 내부의 값을 가져올 수 있습니다.
    println!("첫번째 요소 : {}", long_tuple.0);
    println!("두번째 요소 : {}", long_tuple.1);

    // 튜플도 튜플의 요소로 들어갈 수 있습니다.
    let tuple_of_tuples = ((1u8, 2u16, 2u32), (4u64, -1i8), -2i16);

    // 튜플을 출력해봅니다.
    println!("튜플의 튜플: {:?}", tuple_of_tuples);
    
    // 너무 긴 튜플은 출력할 수 없습니다.
    // let too_long_tuple = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13);
    // println!("too long tuple: {:?}", too_long_tuple);
    // TODO ^ 위의 두줄의 코멘트를 없애고 컴파일러 에러를 확인하세요.

    let pair = (1, true);
    println!("pair 의 값은 {:?} 입니다.", pair);

    println!("pair 를 뒤집으면 {:?} 가 됩니다.", reverse(pair));

    // 요소가 하나인 튜플도 만들 수 있습니다. 단순히 괄호로 감싼 변수와 
    // 구분하기 위해 뒤에 쉼표를 하나 넣어줘야 합니다.
    println!("요소 한 개짜리 튜플: {:?}", (5u32,));
    println!("그냥 한 개의 정수: {:?}", (5u32));

    // 튜플은 바인딩으로 분해할 수 있습니다.
    let tuple = (1, "hello", 4.5, true);

    let (a, b, c, d) = tuple;
    println!("{:?}, {:?}, {:?}, {:?}", a, b, c, d);

    let matrix = Matrix(1.1, 1.2, 2.1, 2.2);
    println!("{:?}", matrix);

}

실습

  1. 복습: 위의 예제에서 구조체 Matrix 를 위한 fmt::Display 트레잇을 추가하고, 디버깅 포맷 {:?} 을 디스플레이 포맷 {}으로 변경하고, 다음처럼 출력되게 합니다.

    ( 1.1 1.2 )
    ( 2.1 2.2 )
    

    앞서 보았던 예제를 참고하세요.

  2. reverse 함수를 기반으로 transpose 함수를 만들고, matrix 를 인자로 받도록 해서, 두 요소를 바꿔치기한 matrix 를 리턴하게 하세요.

    println!("Matrix:\n{}", matrix);
    println!("Transpose:\n{}", transpose(matrix));

    라고 하면 다음이 출력되게 만드시면 됩니다.

    Matrix:
    ( 1.1 1.2 )
    ( 2.1 2.2 )
    Transpose:
    ( 1.1 2.1 )
    ( 1.2 2.2 )
    

배열과 슬라이스

배열은 동일한 자료형(T)의 모음으로, 연속된 메모리에 저장됩니다. 대괄호 [] 로 정의하고, 길이는 [T; length] 로 컴파일 할 때 지정해야합니다.

슬라이스는 배열과 비슷한 개념이지만, 컴파일 할 때 길이를 지정하지 않는 점이 다릅니다. 대신, 객체 내부에 2개의 워드를 가집니다. 첫번째 워드는 데이터를 가리키는 포인터로 사용하고, 두번째 워드에는 슬라이스의 길이를 저장합니다. 워드는 usize와 크기가 같은데, 이 값은 CPU 에따라 달라집니다. 즉 x86-64 CPU에서는 64비트가 됩니다. 배열의 일부를 빌려올 때 사용할 수 있고, 문법은 &[T] 입니다.

use std::mem;

// 이 함수는 슬라이스의 소유권을 빌려옵니다.
fn analyze_slice(slice: &[i32]) {
    println!("전달된 슬라이스의 첫 요소는 {}입니다.", slice[0]);
    println!("전달된 슬라이스는 {}개의 요소를 가지고 있습니다.", slice.len());
}

fn main() {
    // 길이가 고정된 배열 (배열에서 지정했으므로, 추가로 자료형을 지정하지 않아도 됩니다)
    let xs: [i32; 5] = [1, 2, 3, 4, 5];

    // 모든 요소들을 같은 값으로 초기화할 수 있습니다.
    let ys: [i32; 500] = [0; 500];

    // 인덱스는 0 부터 시작합니다.
    println!("배열의 첫번째 요소: {}", xs[0]);
    println!("배열의 두번째 요소: {}", xs[1]);

    // 배열에 들어있는 요소의 갯수는 `len` 으로 알 수 있습니다.
    println!("배열의 요소의 갯수: {}", xs.len());

    // 배열은 스택에 할당됩니다.
    println!("배열에는 {} 바이트가 할당되어 있습니다.", mem::size_of_val(&xs));

    // 배열은 자동으로 슬라이스 형태로 빌릴 수 있습니다.
    println!("배열 전체를 슬라이스로 빌립니다.");
    analyze_slice(&xs);

    // 슬라이스느 배열의 특정 부분을 지정할 수 있습니다.
    // 문법은 [starting_index..ending_index] 입니다.
    // starting_index 는 슬라이스의 처음을 나타내고
    // ending_index 는 슬라이스의 마지막 위치를 나타냅니다.
    println!("배열의 일부를 슬라이스로 빌립니다");
    analyze_slice(&ys[1 .. 4]);

    // 빈 슬라이스 `&[]`의 예제입니다.
    let empty_array: [u32; 0] = []; // 역주) 타입이 없으면 컴파일되지 않습니다
    assert_eq!(&empty_array, &[]); // 역주) 이 매크로는 같으면 계속하고, 다르면 패닉을 리턴합니다.
    assert_eq!(&empty_array, &[][..]);

    // 배열은 `.get`으로 안전하게 접근할 수 있으며, `Option`을 리턴합니다.
    // 아래와 같이 match로 접근해 처리할 수도 있고,
    for i in 0..xs.len() + 1 { // 변수 하나만큼 멀리 가버렸어요!
        match xs.get(i) {
            Some(값) => println!("{}: {}", i, 값),
            None => println!("천천히요! {}는 너무 멀어요!", i),
        }
    }
    // 프로그램을 멈추게 하고 싶으면 `.expect()`를 활용할 수도 있습니다.

    // 아래 문장에서는 index out of bounds 컴파일-시간 오류가 발생합니다.
    println!("{}", xs[5]);
    // 범위 바깥을 인덱싱하려고 하면 index out of bounds 실행-시간 오류가 발생합니다.
    println!("{}", xs[..][5]);
}

사용자 정의 자료형

러스트의 사용자 정의 자료형은 두개의 키워드로 만듭니다.

  • struct: 구조체
  • enum: 열거형

상수의 경우에는 conststatic 키워드로 생성합니다.

구조체

struct 키워드로 만들 수 있는 구조체에는 3가지 타입이 있습니다.

  • 튜플 구조체. 이것은 기본적으로 네임드 튜플입니다.
  • 고전적인 C 언어의 구조체.
  • 유닛 구조체. 필드가 없는 제너릭에 유용한 구조체입니다.
// 미사용 코드에 대한 오류를 숨기기 위해
#![allow(dead_code)]

#[derive(Debug)]
struct Person {
    name: String,
    age: u8,
}

// 유닛 구조체
struct Unit;

// 튜플 구조체
struct Pair(i32, f32);

// 2개의 필드를 가진 구조체
struct Point {
    x: f32,
    y: f32,
}

// 구조체는 다른 구조체의 필드로 이용될 수 있습니다
struct Rectangle {
    // A rectangle can be specified by where the top left and bottom right
    // corners are in space.
    top_left: Point,
    bottom_right: Point,
}

fn main() {
    // Create struct with field init shorthand
    let name = String::from("Peter");
    let age = 27;
    let peter = Person { name, age };

    // Print debug struct
    println!("{:?}", peter);


    // Instantiate a `Point`
    let point: Point = Point { x: 5.2, y: 0.4 };
    let another_point: Point = Point { x: 10.3, y: 0.2 };

    // Access the fields of the point
    println!("point coordinates: ({}, {})", point.x, point.y);

    // Make a new point by using struct update syntax to use the fields of our
    // other one
    let bottom_right = Point { x: 10.3, ..another_point };

    // `bottom_right.y` will be the same as `point.y` because we used that field
    // from `point`
    println!("second point: ({}, {})", bottom_right.x, bottom_right.y);

    // Destructure the point using a `let` binding
    let Point { x: left_edge, y: top_edge } = point;

    let _rectangle = Rectangle {
        // struct instantiation is an expression too
        top_left: Point { x: left_edge, y: top_edge },
        bottom_right: bottom_right,
    };

    // Instantiate a unit struct
    let _unit = Unit;

    // Instantiate a tuple struct
    let pair = Pair(1, 0.1);

    // Access the fields of a tuple struct
    println!("pair contains {:?} and {:?}", pair.0, pair.1);

    // Destructure a tuple struct
    let Pair(integer, decimal) = pair;

    println!("pair contains {:?} and {:?}", integer, decimal);
}

실습

  1. 직사각형의 넓이를 계산하는 rect_area 함수를 추가해보세요. (아이템 중첩 해체를 활용해보세요)
  2. Pointf32를 인수로 받아 왼쪽 바닥의 점을 Point에서, 폭과 높이를 f32에서 계산해 Rectangle을 리턴하는 square 함수를 추가해보세요.

참고

attributes, 원본 식별자해체

열거형

enum 키워드는 여러 종류 중 하나일 수 있는 타입을 만들 수 있게 해줍니다. struct에서 유효한 어느 타입도 enum에서 유효합니다.

// 웹 이벤트를 클래스화하는 `enum`을 생성합니다.
// 이름과 타입 정보가 어떻게 선언되는지에 주목하세요:
// `PageLoad != PageUnload`이고 `KeyPress(char) != Paste(string)`입니다.
// 각각은 서로 다르고 독립적입니다.
enum WebEvent {
    // `enum`의 각 항목은 `unit-like`일수도,
    PageLoad,
    PageUnload,
    // 튜플 구조체일수도,
    KeyPress(char),
    Paste(String),
    // C 스타일 구조체일 수도 있습니다.
    Click { x: i64, y: i64 },
}

// `WebEvent` 열거형을 받아 아무것도 리턴하지 않는 함수입니다.
// 역주: 함수가 정상적으로 종료되고 로직이 이어지므로 `()`를 리턴하는 것으로
//       해석할 수 있습니다.
fn inspect(event: WebEvent) {
    match event {
        WebEvent::PageLoad => println!("page loaded"),
        WebEvent::PageUnload => println!("page unloaded"),
        // Destructure `c` from inside the `enum` variant.
        WebEvent::KeyPress(c) => println!("pressed '{}'.", c),
        WebEvent::Paste(s) => println!("pasted \"{}\".", s),
        // Destructure `Click` into `x` and `y`.
        WebEvent::Click { x, y } => {
            println!("clicked at x={}, y={}.", x, y);
        },
    }
}

fn main() {
    let pressed = WebEvent::KeyPress('x');
    // `to_owned()` creates an owned `String` from a string slice.
    let pasted  = WebEvent::Paste("my text".to_owned());
    let click   = WebEvent::Click { x: 20, y: 80 };
    let load    = WebEvent::PageLoad;
    let unload  = WebEvent::PageUnload;

    inspect(pressed);
    inspect(pasted);
    inspect(click);
    inspect(load);
    inspect(unload);
}

타입 별칭

타입 별칭을 사용하면 각 열거형의 항목을 별칭으로 호출할 수 있습니다. 열거형의 이름이 너무 길거나 너무 일반적이어서, 이름을 다시 지정하고 싶을 때 유용합니다.

enum 이름이아주긴_숫자로할수있는작업_열거형 {
    Add,
    Subtract,
}

// Creates a type alias
type Operations = 이름이아주긴_숫자로할수있는작업_열거형;

fn main() {
    // We can refer to each variant via its alias, not its long and inconvenient
    // name.
    let x = Operations::Add;
}

가장 자주 볼 수 있는 곳은 impl 블록의 Self 별칭일 것입니다.

enum VeryVerboseEnumOfThingsToDoWithNumbers {
    Add,
    Subtract,
}

impl VeryVerboseEnumOfThingsToDoWithNumbers {
    fn run(&self, x: i32, y: i32) -> i32 {
        match self {
            Self::Add => x + y,
            Self::Subtract => x - y,
        }
    }
}

열거형과 타입 별칭에 대해 더 알고 싶다면, 이 기능이 Rust로서 안정화될 때의 안정화 리포트를 참고할 수 있습니다.

함께 읽기:

match, fnString, "Type alias enum variants" RFC

use

use 선언을 통해 수동으로 스코핑하지 않고도 내부 상수를 이용할 수 있습니다:

// An attribute to hide warnings for unused code.
#![allow(dead_code)]

enum Stage {
    Beginner,
    Advanced,
}

enum Role {
    Student,
    Teacher,
}

fn main() {
    // 각각을 `use`함으로서 스코핑 없이 사용하도록 합니다
    use crate::Stage::{Beginner, Advanced};
    // `Role` 안의 모든 항목을 자동으로 `use`합니다
    use crate::Role::*;

    //`Stage::Beginner`와 동일
    let stage = Beginner;
    // `Role::Student`와 동일
    let role = Student;

    match stage {
        // 위에서 `use`를 사용함으로서 수동 스코핑이 없음을 주목하세요
        Beginner => println!("Beginners are starting their learning journey!"),
        Advanced => println!("Advanced learners are mastering their subjects..."),
    }

    match role {
        // Note again the lack of scoping.
        Student => println!("Students are acquiring knowledge!"),
        Teacher => println!("Teachers are spreading knowledge!"),
    }
}

함께 읽기:

matchuse

C 방식 열거형

enum은 C 방식 열거형도 사용 가능합니다.

// An attribute to hide warnings for unused code.
#![allow(dead_code)]

// 타입과 값을 지정하지 않은 열거형 (0i32에서 시작)
enum Number {
    Zero,
    One,
    Two,
}

// 값이 지정된 열거형
enum Color {
    Red = 0xff0000,
    Green = 0x00ff00,
    Blue = 0x0000ff,
}

fn main() {
    // C 방식 열거형은 정수형으로 타입 캐스팅할 수 있습니다.
    println!("zero is {}", Number::Zero as i32);
    println!("one is {}", Number::One as i32);

    println!("roses are #{:06x}", Color::Red as i32);
    println!("violets are #{:06x}", Color::Blue as i32);
}

See also:

casting

테스트 케이스: 연결 리스트

통상적으로 연결 리스트를 구현하는 방법은 enum을 활용하는 것입니다:

// 역주: 이렇게 하면 해당 파일 전체에서 `List` 하위 객체를
//       스코핑 없이 바로 접근할 수 있게 됩니다.
use crate::List::*;

enum List {
    // Cons: 항목과 다음 노드로의 포인터를 감싸는 튜플 구조체
    Cons(u32, Box<List>),
    // Nil: 연결 리스트의 끝을 나타내는 선언자
    Nil,
}

// 열거형도 메서드를 붙일 수 있습니다
impl List {
    // 빈 `List`를 생성합니다
    fn new() -> List {
        // `Nil`은 `List` 타입입니다
        Nil
    }

    // `List`의 소유권을 가져와, 새 항목이 맨 앞에 추가된 `List`를
    // 만들어 리턴합니다.
    fn prepend(self, elem: u32) -> List {
        // `Cons`도 `List` 타입입니다
        Cons(elem, Box::new(self))
    }

    // `List`의 길이를 리턴합니다
    fn len(&self) -> u32 {
        // `self`의 종류에 따라 행동이 바뀌기에 매칭이 필요합니다.
        // `self`는 `&List` 타입이고 `*self`는 `List` 타입이며,
        // 정적 타입 `T`가 `&T`보다 매칭에 선호됩니다.
        // Rust 2018부터는 레퍼런스 없이 self나 tail을 사용할 수 있으며,
        // 이때 러스트는 `&s`와 `ref tail`을 추정해 컴파일합니다.
        // 더 보기: https://doc.rust-lang.org/edition-guide/rust-2018/ownership-and-lifetimes/default-match-bindings.html
        match *self {
            // `self`가 빌려온 것으로 `tail`을 가져올 수 없어, 이의 레퍼런스를
            // 가져옵니다.
            Cons(_, ref tail) => 1 + tail.len(),
            // 빈 리스트는 기본적으로 0의 길이이죠.
            Nil => 0
        }
    }

    // 리스트 전체를 (heap 할당된) `String`으로 리턴합니다
    fn stringify(&self) -> String {
        match *self {
            Cons(head, ref tail) => {
                // `format!`은 `print!`와 유사하지만, 콘솔에 뿌리는 대신
                // heap 할당된 string을 리턴합니다
                format!("{}, {}", head, tail.stringify())
            },
            Nil => {
                format!("Nil")
            },
        }
    }
}

fn main() {
    // 빈 연결 리스트를 만듭니다
    let mut list = List::new();

    // 서두에 아이템 몇 개를 추가합니다
    list = list.prepend(1);
    list = list.prepend(2);
    list = list.prepend(3);

    // 최종 상태를 확인합니다
    println!("linked list has length: {}", list.len());
    println!("{}", list.stringify());
}

함께 읽기:

Boxmethods

상수

러스트에는 글로벌을 포함해 어느 스코프에서나 선언될 수 있는 두 가지의 상수가 있습니다. 특정한 타입 어노테이션이 요구됩니다:

  • const: 변경되지 않는 값 (일반적인 경우)
  • static: 'static 라이프타임을 가지는 값이 바뀔 수 있는 변수입니다. static 라이프타입은 추정되므로 선언할 필요가 없습니다. static mutable 변수에 접근하거나 수정하는 행위는 unsafe로 정의됩니다.
// 글로벌 상수는 모든 스코프 밖에 선언됩니다
static LANGUAGE: &str = "Rust";
const THRESHOLD: i32 = 10;

fn is_big(n: i32) -> bool {
    // 함수 안에서 상수 접근
    n > THRESHOLD
}

fn main() {
    let n = 16;

    // 메인 스레드에서 상수 접근
    println!("This is {}", LANGUAGE);
    println!("The threshold is {}", THRESHOLD);
    println!("{} is {}", n, if is_big(n) { "big" } else { "small" });

    // 오류! `const`는 수정할 수 없습니다.
    THRESHOLD = 5;
    // FIXME ^ Comment out this line
}

함께 읽기:

The const/static RFC, 'static lifetime

변수 할당

러스트는 정적 타입을 이용해 타입 안전을 제공합니다. 이에 따라 변수를 할당할 때 타입을 지정할 수 있습니다. 하지만 대부분의 경우, 컴파일러가 타입을 사용으로부터 추정할 수 있기 때문에, 타입 선언의 필요성이 크게 줄어듭니다.

리터럴과 같은 모든 값은 let 바인딩을 통해 변수에 할당할 수 있습니다.

fn main() {
    let an_integer = 1u32;
    let a_boolean = true;
    let unit = ();

    // copy `an_integer` into `copied_integer`
    let copied_integer = an_integer;

    println!("An integer: {:?}", copied_integer);
    println!("A boolean: {:?}", a_boolean);
    println!("Meet the unit value: {:?}", unit);

    // The compiler warns about unused variable bindings; these warnings can
    // be silenced by prefixing the variable name with an underscore
    let _unused_variable = 3u32;

    let noisy_unused_variable = 2u32;
    // FIXME ^ Prefix with an underscore to suppress the warning
}

수정 가능성

변수 할당은 기본적으로 수정 불가능하지만, mut 수정자를 통해 덮어씌울 수 있습니다.

fn main() {
    let _immutable_binding = 1;
    let mut mutable_binding = 1;

    println!("Before mutation: {}", mutable_binding);

    // Ok
    mutable_binding += 1;

    println!("After mutation: {}", mutable_binding);

    // Error!
    _immutable_binding += 1;
    // FIXME ^ Comment out this line
}

컴파일러는 수정 가능성 오류에 대해 자세한 분석을 리턴합니다.

스코프와 섀도잉

변수 할당은 스코프를 가지며, 블록 안에서 존재하는 것으로 간주됩니다. 블록은 중괄호 {}로 묶인 선언문 덩어리입니다.

fn main() {
    // 이 할당문은 main 함수에 있습니다.
    let long_lived_binding = 1;

    // 이것이 블록이며, main 함수보다 작은 스코프입니다.
    {
        // 이 바인딩은 이 스코프에만 있습니다.
        let short_lived_binding = 2;

        println!("inner short: {}", short_lived_binding);
    }
    // 블록 끝

    // 오류! `short_lived_binding`이 더이상 존재하지 않습니다.
    println!("outer short: {}", short_lived_binding);
    // FIXME ^ Comment out this line

    println!("outer long: {}", long_lived_binding);
}

또, 변수 섀도잉도 가능합니다.

fn main() {
    let shadowed_binding = 1;

    {
        println!("before being shadowed: {}", shadowed_binding);

        // This binding *shadows* the outer one
        let shadowed_binding = "abc";

        println!("shadowed in inner block: {}", shadowed_binding);
    }
    println!("outside inner block: {}", shadowed_binding);

    // This binding *shadows* the previous binding
    let shadowed_binding = 2;
    println!("shadowed in outer block: {}", shadowed_binding);
}

우선 선언하기

변수 할당을 우선 선언하고 나중에 초기화할 수도 있지만, 모든 변수는 사용되기 전에 초기화되어야 합니다: 컴파일러가 초기화되지 않은 변수의 사용을 금지하며, 이는 지정되지 않은 행위로 이어질 수 있기 때문입니다.

변수의 선언과 초기화를 따로 하는 경우는 흔치 않습니다. 코드를 읽을 때 초기화문이 떨어져 있으면 이를 찾기도 어렵고요. 변수 자체를 그 변수가 사용되는 곳 근처에 선언하는 경우가 일반적입니다.

fn main() {
    // Declare a variable binding
    let a_binding;

    {
        let x = 2;

        // Initialize the binding
        a_binding = x * x;
    } // 역주: a_binding을 초기화하기 위해 사용한 x: i32는 이제 없습니다

    println!("a binding: {}", a_binding);

    let another_binding;

    // Error! Use of uninitialized binding
    println!("another binding: {}", another_binding);
    // FIXME ^ Comment out this line

    another_binding = 1;

    println!("another binding: {}", another_binding);
}

고정하기

수정 가능한 변수가 같은 이름의 수정되지 않는 변수로 추가 선언되면, 고정 상태가 됩니다. 고정된 데이터는 수정되지 않는 변수의 스코프 밖으로 나갈 때까지 수정할 수 없게 됩니다:

fn main() {
    let mut _mutable_integer = 7i32;

    {
        // Shadowing by immutable `_mutable_integer`
        let _mutable_integer = _mutable_integer;

        // Error! `_mutable_integer` is frozen in this scope
        _mutable_integer = 50;
        // FIXME ^ Comment out this line

        // `_mutable_integer` goes out of scope
    }

    // Ok! `_mutable_integer` is not frozen in this scope
    _mutable_integer = 3;
}

자료형 다루기

러스트는 사전 제공 또는 사용자 지정된 타입에 대해 여러 수정 및 정의법을 제공합니다. 다음 섹션에서는 다음을 다룹니다:

타입 캐스팅

러스트는 기본 타입에서 특정하지 않은 경우의 타입 변경(간주하기)을 지원하지 않습니다. 하지만 특정된 타입 변환(캐스팅)은 as 키워드를 통해 제공합니다.

일반적인 경우 타입 변환은 C를 준수하지만, C에서 정의되지 않는 경우는 그렇지 않습니다. 러스트는 모든 숫자 타입 변환 방식에 대해 자세하게 정의하고 있습니다.

// Suppress all warnings from casts which overflow.
#![allow(overflowing_literals)]

fn main() {
    let decimal = 65.4321_f32;

    // Error! 특정하지 않은 타입 변경 불가
    let integer: u8 = decimal;
    // FIXME ^ Comment out this line

    // 특정된 타입 변환
    let integer = decimal as u8;
    let character = integer as char;

    // Error! 타입 변환에는 제한이 있습니다.
    // float는 char로 바로 변환할 수 없습니다.
    let character = decimal as char;
    // FIXME ^ Comment out this line

    println!("Casting: {} -> {} -> {}", decimal, integer, character);

    // 값을 음수가 없는 타입 T로 캐스팅할 때, 값이 새 타입에 맞아들어갈 때까지
    // T::MAX + 1을 더하거나 뺍니다.

    // 1000은 이미 u16에 들어갑니다
    println!("1000 as a u16 is: {}", 1000 as u16);

    // 1000 - 256 - 256 - 256 = 232
    // 내부적으로, 낮은 8비트 값(LSB)은 유지되고 높은 8비트 값(MSB)는 삭제됩니다.
    println!("1000 as a u8 is : {}", 1000 as u8);
    // -1 + 256 = 255
    println!("  -1 as a u8 is : {}", (-1i8) as u8);

    // 양수에서 이 작업은 modulus와 같습니다
    println!("1000 mod 256 is : {}", 1000 % 256);

    // 음수가 있는 타입으로 변환하면 (이진 수준) 결과가 음수가 없는 것과 동일하게
    // 변환됩니다. 가장 높은 비트가 1이면 음수가 됩니다.

    // 당연히, 이미 맞는 경우엔 딱히 뭐가 없죠.
    println!(" 128 as a i16 is: {}", 128 as i16);

    // 경계면의 경우로, 128의 8비트 2의 보수는 -128입니다
    println!(" 128 as a i8 is : {}", 128 as i8);

    // 위의 예시를 반복합니다
    // 1000 as u8 -> 232
    println!("1000 as a u8 is : {}", 1000 as u8);
    // 그리고 232의 2의 보수는 -24죠
    println!(" 232 as a i8 is : {}", 232 as i8);
    
    // 러스트 1.45부터, `as` 키워드가 float -> int 캐스팅에서 
    // *saturating cast*로 동작합니다. 부동소수점 값이 최대값을 넘거나 최소값보다
    // 작아지면, 그 결과값은 그 경계값이 됩니다.
    
    // 300.0 as u8 is 255
    println!("300.0 is {}", 300.0_f32 as u8);
    // -100.0 as u8 is 0
    println!("-100.0 as u8 is {}", -100.0_f32 as u8);
    // nan as u8 is 0
    println!("nan as u8 is {}", f32::NAN as u8);
    
    // 약간의 런타임 비용을 소모하며, unsafe 메서드를 이용해 피할 수 있습니다.
    // 결과값이 오버플로우되어 **이상한 값**을 리턴하는 경우도 있으니,
    // 현명하게 사용해 주세요!
    unsafe {
        // 역주: unsafe fn {f32, f64}.to_int_unchecked::<T> -> T
        //       where T is one of
        //         {u8 u16, u32, u64, u128, i8, i16, i32, i64, i128}

        // 300.0 is 44
        println!("300.0 is {}", 300.0_f32.to_int_unchecked::<u8>());
        // -100.0 as u8 is 156
        println!("-100.0 as u8 is {}", (-100.0_f32).to_int_unchecked::<u8>());
        // nan as u8 is 0
        println!("nan as u8 is {}", f32::NAN.to_int_unchecked::<u8>());
    }
}

타입 리터럴

숫자 리터럴은 접미어를 통해 타입 선언될 수 있습니다. 예를 들어 42i32 타입이어야 하는 경우, 42i32로 선언하면 됩니다.

접미어 없이 선언된 숫자 리터럴은 사용에 따라 타입이 지정됩니다. 맥락을 확인할 수 없는 경우, 컴파일러는 정수는 i32, 부동소수점은 f64로 지정합니다.

fn main() {
    // Suffixed literals, their types are known at initialization
    let x = 1u8;
    let y = 2u32;
    let z = 3f32;

    // Unsuffixed literals, their types depend on how they are used
    let i = 1;
    let f = 1.0;

    // `size_of_val` returns the size of a variable in bytes
    println!("size of `x` in bytes: {}", std::mem::size_of_val(&x));
    println!("size of `y` in bytes: {}", std::mem::size_of_val(&y));
    println!("size of `z` in bytes: {}", std::mem::size_of_val(&z));
    println!("size of `i` in bytes: {}", std::mem::size_of_val(&i));
    println!("size of `f` in bytes: {}", std::mem::size_of_val(&f));
}

위 코드에는 아직 설명하지 않은 컨셉이 사용되어, 급한 분들을 위해 짧은 설명을 덧붙입니다:

  • std::mem::size_of_val은 함수이지만 전체 경로를 통해 호출되었습니다. 코드는 모듈이라는 논리적 단위로 나뉩니다. 여기서는 size_of_val 함수는 mem 모듈에, mem 모듈은 std 크레이트에 선언되어 있습니다. 더 자세한 것은 모듈크레이트를 참고하세요.

타입 추정

타입 추정 엔진은 아주 똑똑합니다. 선언할 때의 값만 보는 것이 아니라, 그 변수가 이후 어떻게 이용되는지까지 모두 추적해 타입을 추정합니다. 여기 타입 추정에 대한 고급 예시가 있습니다:

fn main() {
    // 접미사에 의해 컴파일러는 `elem`이 `u8`임을 확인합니다.
    let elem = 5u8;

    // 빈 벡터(늘어날 수 있는 배열) 선언
    let mut vec = Vec::new();
    // 여기에서 컴파일러는 `vec`이 정확하게 어느 타입인지 모릅니다. 그저 무언가의
    // 벡터(`Vec<_>`)라는 것 정도만 확인하죠.

    // `elem`을 이제 vec에 넣어봅시다.
    vec.push(elem);
    // 아하! 이 시점에서 컴파일러는 `vec`이 `u8`의 벡터(`Vec<u8>`)임을 확인합니다.
    // TODO ^ 위의 `vec.push(elem)` 줄을 주석 처리해보세요

    println!("{:?}", vec);
}

변수에 대한 타입 선언 없이도, 컴파일러도 행복하고 프로그래머도 행복합니다!

별명 만들기

type 선언문은 기존 타입에 새로운 이름을 제공합니다. 타입 이름은 반드시 UpperCamelCase(대문자 시작 카멜-케이스)로 지어야 하며, 아닌 경우 컴파일러에서 경고를 낼 겁니다. usizef32 등의 사전 정의된 타입은 예외입니다.

// `NanoSecond`, `Inch`, and `U64` are new names for `u64`.
type NanoSecond = u64;
type Inch = u64;
type U64 = u64;

fn main() {
    // `NanoSecond` = `Inch` = `U64` = `u64`.
    let nanoseconds: NanoSecond = 5 as u64;
    let inches: Inch = 2 as U64;

    // Note that type aliases *don't* provide any extra type safety, because
    // aliases are *not* new types
    println!("{} nanoseconds + {} inches = {} unit?",
             nanoseconds,
             inches,
             nanoseconds + inches);
}

이런 타입 별명은 불필요한 코드를 줄이기 위해 사용됩니다. 예를 들어

  • io::Result<T>Result<T, io::Error>의 별칭입니다.
  • 역주: Some<T>()None::<T>은 각각 std::option::Option<T>::Some()std::option::Option<T>::None의 별칭입니다.
    • Option<T>는 다시 std::option::Option<T>의 별칭입니다.

함께 보기:

속성

타입 변환

기본 타입은 캐스팅을 통해 변환이 가능합니다.

러스트에서는 커스텀 타입(struct, enum과 같은)간의 전환을 트레잇을 이용해 진행합니다. 기본적인 변환은 FromInto 트레잇을 이용합니다. 하지만 더 흔한 경우에 대해 특정한 행위가 있을 수 있습니다. 특히 String으로, 또는 String에서 전환하는 경우엔 말이죠.

FromInto

FromInto는 내부적으로 연결되어 있고, 실제로 구현의 일부입니다. 타입 A에서 B로 변환이 가능하다면, B에서 A로 또한 손쉽게 가능하다고 확신할 수 있죠.

From

From 트레잇은 다른 타입에서 자신을 만드는 방법을 정의하게 합니다. 이로서 서로 다른 많은 타입과 전환하는 아주 간단한 방식을 제공할 수 있죠. 표준 라이브러리만에서도 이미 사전 정의 타입과 표준 타입 간의 변환을 수행하는 수많은 구현체가 있습니다.

예를 들어 &str에서 String으로 아주 간단하게 변환할 수 있죠.

#![allow(unused)]
fn main() {
let my_str = "hello";
let my_string = String::from(my_str);
}

비슷한 동작을 우리가 만든 타입에서도 변환을 선언해 진행할 수 있습니다.

use std::convert::From;

#[derive(Debug)]
struct Number {
    value: i32,
}

impl From<i32> for Number {
    fn from(item: i32) -> Self {
        Number { value: item }
    }
}

fn main() {
    let num = Number::from(30);
    println!("My number is {:?}", num);
}

Into

Into 트레잇은 From 트레잇의 반대 동작입니다. 타입을 다른 타입으로 전환하는 방법을 선언하죠.

into()를 호출하는 경우 도착 타입을 지정해야 합니다. 대부분의 경우 컴파일러가 타입 추정을 확실하게 하기 어렵거든요.

use std::convert::Into;

#[derive(Debug)]
struct Number {
    value: i32,
}

impl Into<Number> for i32 {
    fn into(self) -> Number {
        Number { value: self }
    }
}

fn main() {
    let int = 5;
    // 타입 어노테이션을 지워보세요
    let num: Number = int.into();
    println!("My number is {:?}", num);
}

FromInto는 상호 전환이 가능합니다

FromInto는 상호 전환이 가능하도록 설계되어 있어, 두 트레잇 모두에 대해 선언할 필요가 없습니다. From 트레잇을 타입에 선언했다면 Into는 필요할 때 선언됩니다. 하지만 반대 방향은 그렇지 않습니다: Into를 선언한 타입은 From 트레잇을 자동으로 선언하지 않습니다.

use std::convert::From;

#[derive(Debug)]
struct Number {
    value: i32,
}

// `From`을 선언합니다
impl From<i32> for Number {
    fn from(item: i32) -> Self {
        Number { value: item }
    }
}

fn main() {
    let int = 5;
    // `Into`를 사용합니다
    let num: Number = int.into();
    println!("My number is {:?}", num);
}

TryFromTryInto

FromInto와 비슷하게, TryFromTryInto는 두 타입간 변환을 수행하는 표준 트레잇입니다. From/Into와 다르게 TryFrom/TryInto는 실패할 수 있는 변환을 위해 사용하며, 그에 따라 Result를 반환하죠.

use std::convert::TryFrom;
use std::convert::TryInto;

#[derive(Debug, PartialEq)]
struct EvenNumber(i32);

impl TryFrom<i32> for EvenNumber {
    type Error = ();

    fn try_from(value: i32) -> Result<Self, Self::Error> {
        if value % 2 == 0 {
            Ok(EvenNumber(value))
        } else {
            Err(())
        }
    }
}

fn main() {
    // TryFrom

    assert_eq!(EvenNumber::try_from(8), Ok(EvenNumber(8)));
    assert_eq!(EvenNumber::try_from(5), Err(()));

    // TryInto

    let result: Result<EvenNumber, ()> = 8i32.try_into();
    assert_eq!(result, Ok(EvenNumber(8)));
    let result: Result<EvenNumber, ()> = 5i32.try_into();
    assert_eq!(result, Err(()));
}

String으로, String에서

문자열로 변환하기

어떤 타입을 String으로 변환하는 것은 ToString 트레잇을 타입에 구현하는 것 정도로 아주 간단합니다만, 이를 직접 하는 것보다는 ToString과 [print!]에서 다룬 타입 프린팅을 자동으로 제공하는 fmt::Display 트레잇을 구현해야 합니다.

use std::fmt;

struct Circle {
    radius: i32
}

impl fmt::Display for Circle {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "Circle of radius {}", self.radius)
    }
}

fn main() {
    let circle = Circle { radius: 6 };
    println!("{}", circle.to_string());
}

문자열 파싱하기

문자열을 여러 타입으로 변환하는 것은 매우 유용하지만, 대부분의 일반적인 문자열 작업은 이를 수로 변환하는 것입니다. 이상적인 접근은 parse 함수를 사용하고 타입 추정으로 정형화하거나, 고속 파싱을 이용하는 것입니다. 두 방식 모두 아래에 잘 설명되어 있습니다.

fn main() {
    let parsed: i32 = "5".parse().unwrap();
    let turbo_parsed = "10".parse::<i32>().unwrap();

    let sum = parsed + turbo_parsed;
    println!("Sum: {:?}", sum);
}

이런 작업을 사용자 지정 타입에 가져오려면 FromStr 트레잇을 그 타입에 구현하면 됩니다.

use std::num::ParseIntError;
use std::str::FromStr;

#[derive(Debug)]
struct Circle {
    radius: i32,
}

impl FromStr for Circle {
    type Err = ParseIntError;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().parse() {
            Ok(num) => Ok(Circle{ radius: num }),
            Err(e) => Err(e),
        }
    }
}

fn main() {
    let radius = "    3 ";
    let circle: Circle = radius.parse().unwrap();
    println!("{:?}", circle);
}

표현식

(대부분의) 러스트 프로그램은 선언문들로 이루어져 있습니다.

fn main() {
    // statement
    // statement
    // statement
}

러스트엔 몇몇 종류의 선언문이 있습니다. 가장 흔한 것은 변수 선언과 표현식 끝에 ;를 두는 것이죠:

fn main() {
    // variable binding
    let x = 5;

    // expression;
    x;
    x + 1;
    15;
}

블록 또한 표현식이어서, 값이나 대입에 사용될 수 있습니다. 블록의 가장 마지막 표현식의 결과가 내부 변수와 같은 대응식에 전달됩니다. 하지만 마지막 표현식이 ;로 끝난다면, ()가 리턴됩니다.

fn main() {
    let x = 5u32;

    let y = {
        let x_squared = x * x;
        let x_cube = x_squared * x;

        // This expression will be assigned to `y`
        x_cube + x_squared + x
    };

    let z = {
        // The semicolon suppresses this expression and `()` is assigned to `z`
        2 * x;
    };

    println!("x is {:?}", x);
    println!("y is {:?}", y);
    println!("z is {:?}", z);
}

흐름 제어

프로그래밍 언어에서 필수적인 부분은 if/else, for 등과 같이 흐름을 제어하는 방법입니다. 러스트에서 이게 어떻게 동작하는지 알아봅시다.

if/else

ifelse를 이용한 분기는 다른 언어와 유사합니다. 하지만 대부분과 달리 이진 조건문을 괄호로 감쌀 필요가 없으며, 각 조건문 뒤에는 블록이 따라옵니다. if-else 조건문은 표현식이며, 모든 분기는 같은 타입을 반환해야 합니다.

fn main() {
    let n = 5;

    if n < 0 {
        print!("{}은/는 음수입니다", n);
    } else if n > 0 {
        print!("{}은/는 양수입니다", n);
    } else {
        print!("{}는 0입니다", n);
    }

    let big_n =
        if n < 10 && n > -10 {
            println!(", 그리고 작은 수이므로 10을 곱하겠습니다.");

            // 이 표현식은 `i32`를 반환합니다.
            10 * n
        } else {
            println!(", 그리고 큰 수이므로 2로 나누겠습니다.");

            // 위 표현식이 `i32`를 반환하므로, 이 표현식도 `i32`를 반환해야 합니다.
            n / 2
            // TODO ^ 이 표현식 뒤에 `;`를 넣어 보세요
        };
    //   ^ 여기에 세미콜론을 붙여야 함을 기억하세요! 모든 `let` 바인딩 끝에는 세미콜론이 필요합니다.

    println!("{} -> {}", n, big_n);
}

loop

러스트는 무한 반복을 위해 loop 키워드를 제공합니다.

break 선언문을 이용해 무한 반복을 언제든 빠져나올 수 있으며, continue 선언문을 이용해 남은 순서를 건너뛰고 다음 반복으로 넘어갈 수 있습니다.

fn main() {
    let mut count = 0u32;

    println!("Let's count until infinity!");

    // Infinite loop
    loop {
        count += 1;

        if count == 3 {
            println!("three");

            // Skip the rest of this iteration
            continue;
        }

        println!("{}", count);

        if count == 5 {
            println!("OK, that's enough");

            // Exit this loop
            break;
        }
    }
}

중첩과 라벨링

반복문이 중첩되었을 때 바깥의 반복문을 break하거나 continue할 수도 있습니다. 이 경우 반복문을 '라벨과 함께 선언해야 하며, 대상이 되는 라벨을 break/continue문에 제공해야 합니다.

#![allow(unreachable_code)]

fn main() {
    'outer: loop {
        println!("Entered the outer loop");

        'inner: loop {
            println!("Entered the inner loop");

            // This would break only the inner loop
            //break;

            // This breaks the outer loop
            break 'outer;
        } // end of 'inner loop

        println!("This point will never be reached");
    } // end of 'outer loop

    println!("Exited the outer loop");
}

loop에서 리턴하기

loop의 용법 중 하나는 작업이 성공할 때까지 반복하는 겁니다. 하지만 작업이 값을 반환하면 나머지를 건너뛸 수도 있겠죠. 반환할 값을 break 뒤에 얹어주세요. 값이 loop 제어문에서 반환될 겁니다.

fn main() {
    let mut counter = 0;

    let result = loop {
        counter += 1;

        if counter == 10 {
            break counter * 2;
        }
    };

    assert_eq!(result, 20);
}

while

while 키워드는 조건이 참인 동안 반복할 때 사용할 수 있습니다.

유명한 FizzBuzz 예제를 while 반복을 이용해 작성해봅시다.

fn main() {
    // A counter variable
    let mut n = 1;

    // Loop while `n` is less than 101
    while n < 101 {
        if n % 15 == 0 {
            println!("fizzbuzz");
        } else if n % 3 == 0 {
            println!("fizz");
        } else if n % 5 == 0 {
            println!("buzz");
        } else {
            println!("{}", n);
        }

        // Increment counter
        n += 1;
    }
}

for 반복문

for와 range

for in 생성자는 Iterator를 이용해 순서를 매길 수 있습니다. 가장 쉬운 방법은 범위 노테이션 a..b를 이용하는 거죠. 이는 a(포함)부터 b(미포함)까지 1 단위로 증가하는 값을 선언합니다.

FizzBuzz를 while 대신 for를 이용해 작성해봅시다.

fn main() {
    // `n` will take the values: 1, 2, ..., 100 in each iteration
    for n in 1..101 {
        if n % 15 == 0 {
            println!("fizzbuzz");
        } else if n % 3 == 0 {
            println!("fizz");
        } else if n % 5 == 0 {
            println!("buzz");
        } else {
            println!("{}", n);
        }
    }
}

또는, a..=b를 이용할 수도 있습니다. 이 표현은 양쪽 끝이 범위에 포함됩니다. 그래서 위 코드는 아래처럼 작성될 수도 있습니다.

fn main() {
    // `n` will take the values: 1, 2, ..., 100 in each iteration
    for n in 1..=100 {
        if n % 15 == 0 {
            println!("fizzbuzz");
        } else if n % 3 == 0 {
            println!("fizz");
        } else if n % 5 == 0 {
            println!("buzz");
        } else {
            println!("{}", n);
        }
    }
}

for와 반복자

for in 구조는 Iterator를 여러 방법으로 접근할 수 있습니다. 반복자 트레잇에서 얘기했듯이, for 반복문은 집합에서 into_iter 함수를 적용합니다. 하지만 이것만이 집합을 반복자로 변환하는 방법은 아닙니다.

into_iter, iteriter_mut는 내부의 데이터에 대한 서로 다른 시야를 제공함으로서 집합을 반복자로 변환합니다.

  • iter - 매 반복마다 집합에서 각각의 항목을 빌려옵니다. 원래 집합을 가져오지 않으므로, 원래 집합을 반복문 후에도 활용할 수 있죠.
fn main() {
    let names = vec!["Bob", "Frank", "Ferris"];

    for name in names.iter() {
        match name {
            &"Ferris" => println!("There is a rustacean among us!"),
            _ => println!("Hello {}", name),
        }
    }
}
  • into_iter - 원래 집합을 가져다 반복자로 만들기 때문에 매 반복마다 정확히 그 데이터가 제공됩니다. 원래 집합을 그대로 가져다 썼기 때문에 그 집합이 반복문으로 '이동'했고, 반복 이후에 해당 집합을 다시 사용할 수도 없습니다.
fn main() {
    let names = vec!["Bob", "Frank", "Ferris"];

    for name in names.into_iter() {
        match name {
            "Ferris" => println!("There is a rustacean among us!"),
            _ => println!("Hello {}", name),
        }
    }
}
  • iter_mut - 매 반복마다 항목을 수정 가능한 형태로 빌려오며, 그 값이 변경될 수 있도록 해줍니다.
fn main() {
    let mut names = vec!["Bob", "Frank", "Ferris"];
    //  ^^^ 역주: iter_mut를 호출하려면 원본이 수정 가능해야 합니다. 따라서 mut 키워드를 붙입니다.

    for name in names.iter_mut() {
        *name = match name {
            &mut "Ferris" => "There is a rustacean among us!",
            _ => "Hello",
        }
    }

    println!("names: {:?}", names);
}

위 예시에서 match 분기의 타입을 확인하세요. 이 부분이 각 반복자 생성 함수의 핵심 차이점입니다. 타입이 다르다는 것은 또한 서로 다른 작업이 가능하다는 것을 의미하기도 합니다.

함께 읽기:

Iterator

match

러스트는 패턴 매칭 과정에서 C의 switch와 비슷하게 활용할 수 있는 match 키워드를 제공합니다.

fn main() {
    let number = 13;
    // TODO ^ Try different values for `number`

    println!("Tell me about {}", number);
    match number {
        // Match a single value
        1 => println!("One!"),
        // Match several values
        2 | 3 | 5 | 7 | 11 => println!("This is a prime"),
        // Match an inclusive range
        13..=19 => println!("A teen"),
        // Handle the rest of cases
        _ => println!("Ain't special"),
    }

    let boolean = true;
    // Match is an expression too
    let binary = match boolean {
        // The arms of a match must cover all the possible values
        false => 0,
        true => 1,
        // TODO ^ Try commenting out one of these arms
    };

    println!("{} -> {}", boolean, binary);
}

아이템 해체

match 블록을 이용해 아이템을 여러 방법으로 해체할 수 있습니다.

튜플

튜플은 match로 아래와 같이 해체할 수 있습니다:

fn main() {
    let triple = (0, -2, 3);
    // TODO ^ Try different values for `triple`

    println!("Tell me about {:?}", triple);
    // Match can be used to destructure a tuple
    match triple {
        // Destructure the second and third elements
        (0, y, z) => println!("First is `0`, `y` is {:?}, and `z` is {:?}", y, z),
        (1, ..)  => println!("First is `1` and the rest doesn't matter"),
        // `..` can be the used ignore the rest of the tuple
        _      => println!("It doesn't matter what they are"),
        // `_` means don't bind the value to a variable
    }
}

함께 읽기:

Tuples

배열/슬라이스

튜플과 같이, 배열과 슬라이스도 이런 식으로 해체할 수 있습니다:

fn main() {
    // Try changing the values in the array, or make it a slice!
    let array = [1, -2, 6];

    match array {
        // Binds the second and the third elements to the respective variables
        [0, second, third] =>
            println!("array[0] = 0, array[1] = {}, array[2] = {}", second, third),

        // Single values can be ignored with _
        [1, _, third] => println!(
            "array[0] = 1, array[2] = {} and array[1] was ignored",
            third
        ),

        // You can also bind some and ignore the rest
        [-1, second, ..] => println!(
            "array[0] = -1, array[1] = {} and all the other ones were ignored",
            second
        ),
        // The code below would not compile
        // [-1, second] => ...

        // Or store them in another array/slice (the type depends on
        // that of the value that is being matched against)
        [3, second, tail @ ..] => println!(
            "array[0] = 3, array[1] = {} and the other elements were {:?}",
            second, tail
        ),

        // Combining these patterns, we can, for example, bind the first and
        // last values, and store the rest of them in a single array
        [first, middle @ .., last] => println!(
            "array[0] = {}, middle = {:?}, array[2] = {}",
            first, middle, last
        ),
    }
}

함께 읽기:

배열과 슬라이스, 바인딩 for @ sigil

열거형

열거형은 이런 식으로 해체할 수 있습니다:

// `allow` required to silence warnings because only
// one variant is used.
#[allow(dead_code)]
enum Color {
    // These 3 are specified solely by their name.
    Red,
    Blue,
    Green,
    // These likewise tie `u32` tuples to different names: color models.
    RGB(u32, u32, u32),
    HSV(u32, u32, u32),
    HSL(u32, u32, u32),
    CMY(u32, u32, u32),
    CMYK(u32, u32, u32, u32),
}

fn main() {
    let color = Color::RGB(122, 17, 40);
    // TODO ^ Try different variants for `color`

    println!("What color is it?");
    // An `enum` can be destructured using a `match`.
    match color {
        Color::Red   => println!("The color is Red!"),
        Color::Blue  => println!("The color is Blue!"),
        Color::Green => println!("The color is Green!"),
        Color::RGB(r, g, b) =>
            println!("Red: {}, green: {}, and blue: {}!", r, g, b),
        Color::HSV(h, s, v) =>
            println!("Hue: {}, saturation: {}, value: {}!", h, s, v),
        Color::HSL(h, s, l) =>
            println!("Hue: {}, saturation: {}, lightness: {}!", h, s, l),
        Color::CMY(c, m, y) =>
            println!("Cyan: {}, magenta: {}, yellow: {}!", c, m, y),
        Color::CMYK(c, m, y, k) =>
            println!("Cyan: {}, magenta: {}, yellow: {}, key (black): {}!",
                c, m, y, k),
        // Don't need another arm because all variants have been examined
    }
}

함께 읽기:

#[allow(...)], 색상 모델enum

포인터/레퍼런스

C언어 같은 언어들에서와 그 컨셉이 다르기 때문에, 포인터에서는 해체 과정에 비구조화와 비참조화 과정이 필요합니다.

  • 비참조화는 *를 사용합니다
  • 비구조화는 &, ref와/또는 ref mut를 사용합니다
fn main() {
    // Assign a reference of type `i32`. The `&` signifies there
    // is a reference being assigned.
    let reference = &4;

    match reference {
        // If `reference` is pattern matched against `&val`, it results
        // in a comparison like:
        // `&i32`
        // `&val`
        // ^ We see that if the matching `&`s are dropped, then the `i32`
        // should be assigned to `val`.
        &val => println!("Got a value via destructuring: {:?}", val),
    }

    // To avoid the `&`, you dereference before matching.
    match *reference {
        val => println!("Got a value via dereferencing: {:?}", val),
    }

    // What if you don't start with a reference? `reference` was a `&`
    // because the right side was already a reference. This is not
    // a reference because the right side is not one.
    let _not_a_reference = 3;

    // Rust provides `ref` for exactly this purpose. It modifies the
    // assignment so that a reference is created for the element; this
    // reference is assigned.
    let ref _is_a_reference = 3;

    // Accordingly, by defining 2 values without references, references
    // can be retrieved via `ref` and `ref mut`.
    let value = 5;
    let mut mut_value = 6;

    // Use `ref` keyword to create a reference.
    match value {
        ref r => println!("Got a reference to a value: {:?}", r),
    }

    // Use `ref mut` similarly.
    match mut_value {
        ref mut m => {
            // Got a reference. Gotta dereference it before we can
            // add anything to it.
            *m += 10;
            println!("We added 10. `mut_value`: {:?}", m);
        },
    }
}

함께 읽기:

The ref pattern

구조체

비슷하게, 구조체도 아래와 같이 해체할 수 있습니다:

fn main() {
    struct Foo {
        x: (u32, u32),
        y: u32,
    }

    // Try changing the values in the struct to see what happens
    let foo = Foo { x: (1, 2), y: 3 };

    match foo {
        Foo { x: (1, b), y } => println!("First of x is 1, b = {},  y = {} ", b, y),

        // you can destructure structs and rename the variables,
        // the order is not important
        Foo { y: 2, x: i } => println!("y is 2, i = {:?}", i),

        // and you can also ignore some variables:
        Foo { y, .. } => println!("y = {}, we don't care about x", y),
        // this will give an error: pattern does not mention field `x`
        //Foo { y } => println!("y = {}", y),
    }
}

함께 읽:

Structs

가드

match 가드를 이용해 각 행을 필터링할 수 있습니다.

#[allow(dead_code)]
enum Temperature {
    Celsius(i32),
    Fahrenheit(i32),
}

fn main() {
    let temperature = Temperature::Celsius(35);
    // ^ TODO try different values for `temperature`

    match temperature {
        Temperature::Celsius(t) if t > 30 => println!("{}C is above 30 Celsius", t),
        // The `if condition` part ^ is a guard
        Temperature::Celsius(t) => println!("{}C is equal to or below 30 Celsius", t),

        Temperature::Fahrenheit(t) if t > 86 => println!("{}F is above 86 Fahrenheit", t),
        Temperature::Fahrenheit(t) => println!("{}F is equal to or below 86 Fahrenheit", t),
    }
}

컴파일러는 매칭 표현식에서 가능한 모든 경우가 처리 가능하지 않으면 컴파일을 거부한다는 걸 알아두세요.

fn main() {
    let number: u8 = 4;

    match number {
        i if i == 0 => println!("Zero"),
        i if i > 0 => println!("Greater than zero"),
        // _ => unreachable!("Should never happen."),
        // TODO ^ uncomment to fix compilation
    }
}

함께 읽기:

Tuples Enums

바인딩

변수를 간접적으로 접근하는 것은 그 변수를 다시 할당하지 않으면 분기를 하지 못하게 하는 주 원인입니다. match@ 한정자를 이용해 해당 값을 변수로 재할당합니다:

// A function `age` which returns a `u32`.
fn age() -> u32 {
    15
}

fn main() {
    println!("Tell me what type of person you are");

    match age() {
        0             => println!("I haven't celebrated my first birthday yet"),
        // Could `match` 1 ..= 12 directly but then what age
        // would the child be? Instead, bind to `n` for the
        // sequence of 1 ..= 12. Now the age can be reported.
        n @ 1  ..= 12 => println!("I'm a child of age {:?}", n),
        n @ 13 ..= 19 => println!("I'm a teen of age {:?}", n),
        // Nothing bound. Return the result.
        n             => println!("I'm an old person of age {:?}", n),
    }
}

이 바인딩을 통해 Option과 같은 enum형을 해체하는 것도 가능합니다:

fn some_number() -> Option<u32> {
    Some(42)
}

fn main() {
    match some_number() {
        // Got `Some` variant, match if its value, bound to `n`,
        // is equal to 42.
        Some(n @ 42) => println!("The Answer: {}!", n),
        // Match any other number.
        Some(n)      => println!("Not interesting... {}", n),
        // Match anything else (`None` variant).
        _            => (),
    }
}

함께 읽기:

functions, enums and Option

if let

어떤 사용 환경에서는 match를 활용하는 게 애매한 경우가 있습니다. 예를 들어:

#![allow(unused)]
fn main() {
// Make `optional` of type `Option<i32>`
let optional = Some(7);

match optional {
    Some(i) => {
        println!("This is a really long string and `{:?}`", i);
        // ^ Needed 2 indentations just so we could destructure
        // `i` from the option.
    },
    _ => {},
    // ^ Required because `match` is exhaustive. Doesn't it seem
    // like wasted space?
};

}

이런 경우 if let을 사용하는 것이 더 간단하며, 추가로 실패했을 때의 처리도 가능합니다:

fn main() {
    // All have type `Option<i32>`
    let number = Some(7);
    let letter: Option<i32> = None;
    let emoticon: Option<i32> = None;

    // The `if let` construct reads: "if `let` destructures `number` into
    // `Some(i)`, evaluate the block (`{}`).
    if let Some(i) = number {
        println!("Matched {:?}!", i);
    }

    // If you need to specify a failure, use an else:
    if let Some(i) = letter {
        println!("Matched {:?}!", i);
    } else {
        // Destructure failed. Change to the failure case.
        println!("Didn't match a number. Let's go with a letter!");
    }

    // Provide an altered failing condition.
    let i_like_letters = false;

    if let Some(i) = emoticon {
        println!("Matched {:?}!", i);
    // Destructure failed. Evaluate an `else if` condition to see if the
    // alternate failure branch should be taken:
    } else if i_like_letters {
        println!("Didn't match a number. Let's go with a letter!");
    } else {
        // The condition evaluated false. This branch is the default:
        println!("I don't like letters. Let's go with an emoticon :)!");
    }
}

같은 방법으로 if let을 다른 열거형에 활용할 수도 있습니다:

// Our example enum
enum Foo {
    Bar,
    Baz,
    Qux(u32)
}

fn main() {
    // Create example variables
    let a = Foo::Bar;
    let b = Foo::Baz;
    let c = Foo::Qux(100);
    
    // Variable a matches Foo::Bar
    if let Foo::Bar = a {
        println!("a is foobar");
    }
    
    // Variable b does not match Foo::Bar
    // So this will print nothing
    if let Foo::Bar = b {
        println!("b is foobar");
    }
    
    // Variable c matches Foo::Qux which has a value
    // Similar to Some() in the previous example
    if let Foo::Qux(value) = c {
        println!("c is {}", value);
    }

    // Binding also works with `if let`
    if let Foo::Qux(value @ 100) = c {
        println!("c is one hundred");
    }
}

if let의 또다른 장점은 파라미터화되지 않은 열거형 멤버도 매칭할 수 있다는 것입니다. 그 열거형이 PartialEq를 구현하지도 포함하지도 않는 경우에도 가능하죠. 이런 경우 열거형 내부의 멤버에 접근할 수 없어 if Foo::Bar == a는 실패하지만, if let은 여전히 동작한다는 것입니다.

도전을 즐기시나요? 다음 예제를 if let을 이용해 고쳐봅시다:

// This enum purposely neither implements nor derives PartialEq.
// That is why comparing Foo::Bar == a fails below.
enum Foo {Bar}

fn main() {
    let a = Foo::Bar;

    // Variable a matches Foo::Bar
    if Foo::Bar == a {
    // ^-- this causes a compile-time error. Use `if let` instead.
        println!("a is foobar");
    }
}

함께 읽기:

enum, Option, and the RFC

let-else

🛈 안정화 버전: rust 1.65

🛈 다음과 같이 컴파일해 해당 에디션으로 컴파일할 수 있습니다:
rustc --edition=2021 main.rs

let-else를 사용하면, 일반 let처럼 실패할 수 있는 패턴을 매칭하고 해당 스코프에 변수를 할당할 수 있습니다. 패턴이 맞지 않는다면 break, return, panic!으로 발산하는 것도 가능합니다.

use std::str::FromStr;

fn get_count_item(s: &str) -> (u64, &str) {
    let mut it = s.split(' ');
    let (Some(count_str), Some(item)) = (it.next(), it.next()) else {
        // 매칭 실패, panic!
        panic!("Can't segment count item pair: '{s}'");
    };
    let Ok(count) = u64::from_str(count_str) else {
        panic!("Can't parse integer: '{count_str}'");
    };
    (count, item)
}

fn main() {
    assert_eq!(get_count_item("3 chairs"), (3, "chairs"));
}

이름 바인딩의 스코프가 matchif let-else 표현의 차이를 만듭니다. 처음을 match로 바인딩할 때는, 약간의 표현 반복과 외부 let을 이용해야 위 표현을 유사하게 구현할 수 있죠:

#![allow(unused)]
fn main() {
use std::str::FromStr;

fn get_count_item(s: &str) -> (u64, &str) {
    let mut it = s.split(' ');
    let (count_str, item) = match (it.next(), it.next()) {
        (Some(count_str), Some(item)) => (count_str, item),
        _ => panic!("Can't segment count item pair: '{s}'"),
    };
    let count = if let Ok(count) = u64::from_str(count_str) {
        count
    } else {
        panic!("Can't parse integer: '{count_str}'");
    };
    (count, item)
}

assert_eq!(get_count_item("3 chairs"), (3, "chairs"));
}

함께 읽기:

option, match, if let and the let-else RFC.

while let

if let과 비슷하게, while let을 이용해 짜증나는 match 시퀸스를 좀 더 통제 가능하게 바꿀 수 있습니다. i를 증가시키기 위한 다음 코드를 생각해 봅시다:

#![allow(unused)]
fn main() {
// Make `optional` of type `Option<i32>`
let mut optional = Some(0);

// Repeatedly try this test.
loop {
    match optional {
        // If `optional` destructures, evaluate the block.
        Some(i) => {
            if i > 9 {
                println!("Greater than 9, quit!");
                optional = None;
            } else {
                println!("`i` is `{:?}`. Try again.", i);
                optional = Some(i + 1);
            }
            // ^ Requires 3 indentations!
        },
        // 해체가 불가하면 반복문 밖으로 나갑니다:
        _ => { break; }
        // ^ 이 줄을 굳이 써야 할까요? 더 나은 방법이 분명 있을 겁니다!
    }
}
}

while let을 이용하면 이 과정이 좀 더 봐줄 만해집니다.

fn main() {
    // Make `optional` of type `Option<i32>`
    let mut optional = Some(0);

    // 이 블록은 `let`이 `optional`을 `Some(i)`로 해체할 수 있으면 내부 블록을
    // 실행하고, 아니라면 `break`합니다
    while let Some(i) = optional {
        if i > 9 {
            println!("Greater than 9, quit!");
            optional = None;
        } else {
            println!("`i` is `{:?}`. Try again.", i);
            optional = Some(i + 1);
        }
        // ^ Less rightward drift and doesn't require
        // explicitly handling the failing case.
    }
    // ^ `if let`은 `else`나 `else if`를 붙일 수 있었죠.
    // 반면 `while let`은 그렇지 않습니다.
}

함께 읽기:

enum, OptionRFC

Functions

Functions are declared using the fn keyword. Its arguments are type annotated, just like variables, and, if the function returns a value, the return type must be specified after an arrow ->.

The final expression in the function will be used as return value. Alternatively, the return statement can be used to return a value earlier from within the function, even from inside loops or if statements.

Let's rewrite FizzBuzz using functions!

// Unlike C/C++, there's no restriction on the order of function definitions
fn main() {
    // We can use this function here, and define it somewhere later
    fizzbuzz_to(100);
}

// Function that returns a boolean value
fn is_divisible_by(lhs: u32, rhs: u32) -> bool {
    // Corner case, early return
    if rhs == 0 {
        return false;
    }

    // This is an expression, the `return` keyword is not necessary here
    lhs % rhs == 0
}

// Functions that "don't" return a value, actually return the unit type `()`
fn fizzbuzz(n: u32) -> () {
    if is_divisible_by(n, 15) {
        println!("fizzbuzz");
    } else if is_divisible_by(n, 3) {
        println!("fizz");
    } else if is_divisible_by(n, 5) {
        println!("buzz");
    } else {
        println!("{}", n);
    }
}

// When a function returns `()`, the return type can be omitted from the
// signature
fn fizzbuzz_to(n: u32) {
    for n in 1..n + 1 {
        fizzbuzz(n);
    }
}

Methods

Methods are functions attached to objects. These methods have access to the data of the object and its other methods via the self keyword. Methods are defined under an impl block.

struct Point {
    x: f64,
    y: f64,
}

// Implementation block, all `Point` methods go in here
impl Point {
    // This is a static method
    // Static methods don't need to be called by an instance
    // These methods are generally used as constructors
    fn origin() -> Point {
        Point { x: 0.0, y: 0.0 }
    }

    // Another static method, taking two arguments:
    fn new(x: f64, y: f64) -> Point {
        Point { x: x, y: y }
    }
}

struct Rectangle {
    p1: Point,
    p2: Point,
}

impl Rectangle {
    // This is an instance method
    // `&self` is sugar for `self: &Self`, where `Self` is the type of the
    // caller object. In this case `Self` = `Rectangle`
    fn area(&self) -> f64 {
        // `self` gives access to the struct fields via the dot operator
        let Point { x: x1, y: y1 } = self.p1;
        let Point { x: x2, y: y2 } = self.p2;

        // `abs` is a `f64` method that returns the absolute value of the
        // caller
        ((x1 - x2) * (y1 - y2)).abs()
    }

    fn perimeter(&self) -> f64 {
        let Point { x: x1, y: y1 } = self.p1;
        let Point { x: x2, y: y2 } = self.p2;

        2.0 * ((x1 - x2).abs() + (y1 - y2).abs())
    }

    // This method requires the caller object to be mutable
    // `&mut self` desugars to `self: &mut Self`
    fn translate(&mut self, x: f64, y: f64) {
        self.p1.x += x;
        self.p2.x += x;

        self.p1.y += y;
        self.p2.y += y;
    }
}

// `Pair` owns resources: two heap allocated integers
struct Pair(Box<i32>, Box<i32>);

impl Pair {
    // This method "consumes" the resources of the caller object
    // `self` desugars to `self: Self`
    fn destroy(self) {
        // Destructure `self`
        let Pair(first, second) = self;

        println!("Destroying Pair({}, {})", first, second);

        // `first` and `second` go out of scope and get freed
    }
}

fn main() {
    let rectangle = Rectangle {
        // Static methods are called using double colons
        p1: Point::origin(),
        p2: Point::new(3.0, 4.0),
    };

    // Instance methods are called using the dot operator
    // Note that the first argument `&self` is implicitly passed, i.e.
    // `rectangle.perimeter()` === `Rectangle::perimeter(&rectangle)`
    println!("Rectangle perimeter: {}", rectangle.perimeter());
    println!("Rectangle area: {}", rectangle.area());

    let mut square = Rectangle {
        p1: Point::origin(),
        p2: Point::new(1.0, 1.0),
    };

    // Error! `rectangle` is immutable, but this method requires a mutable
    // object
    //rectangle.translate(1.0, 0.0);
    // TODO ^ Try uncommenting this line

    // Okay! Mutable objects can call mutable methods
    square.translate(1.0, 1.0);

    let pair = Pair(Box::new(1), Box::new(2));

    pair.destroy();

    // Error! Previous `destroy` call "consumed" `pair`
    //pair.destroy();
    // TODO ^ Try uncommenting this line
}

Closures

Closures are functions that can capture the enclosing environment. For example, a closure that captures the x variable:

|val| val + x

The syntax and capabilities of closures make them very convenient for on the fly usage. Calling a closure is exactly like calling a function. However, both input and return types can be inferred and input variable names must be specified.

Other characteristics of closures include:

  • using || instead of () around input variables.
  • optional body delimination ({}) for a single expression (mandatory otherwise).
  • the ability to capture the outer environment variables.
fn main() {
    // Increment via closures and functions.
    fn  function            (i: i32) -> i32 { i + 1 }

    // Closures are anonymous, here we are binding them to references
    // Annotation is identical to function annotation but is optional
    // as are the `{}` wrapping the body. These nameless functions
    // are assigned to appropriately named variables.
    let closure_annotated = |i: i32| -> i32 { i + 1 };
    let closure_inferred  = |i     |          i + 1  ;

    let i = 1;
    // Call the function and closures.
    println!("function: {}", function(i));
    println!("closure_annotated: {}", closure_annotated(i));
    println!("closure_inferred: {}", closure_inferred(i));

    // A closure taking no arguments which returns an `i32`.
    // The return type is inferred.
    let one = || 1;
    println!("closure returning one: {}", one());

}

Capturing

Closures are inherently flexible and will do what the functionality requires to make the closure work without annotation. This allows capturing to flexibly adapt to the use case, sometimes moving and sometimes borrowing. Closures can capture variables:

  • by reference: &T
  • by mutable reference: &mut T
  • by value: T

They preferentially capture variables by reference and only go lower when required.

fn main() {
    use std::mem;
    
    let color = String::from("green");

    // A closure to print `color` which immediately borrows (`&`) `color` and
    // stores the borrow and closure in the `print` variable. It will remain
    // borrowed until `print` is used the last time. 
    //
    // `println!` only requires arguments by immutable reference so it doesn't
    // impose anything more restrictive.
    let print = || println!("`color`: {}", color);

    // Call the closure using the borrow.
    print();

    // `color` can be borrowed immutably again, because the closure only holds
    // an immutable reference to `color`. 
    let _reborrow = &color;
    print();

    // A move or reborrow is allowed after the final use of `print`
    let _color_moved = color;


    let mut count = 0;
    // A closure to increment `count` could take either `&mut count` or `count`
    // but `&mut count` is less restrictive so it takes that. Immediately
    // borrows `count`.
    //
    // A `mut` is required on `inc` because a `&mut` is stored inside. Thus,
    // calling the closure mutates the closure which requires a `mut`.
    let mut inc = || {
        count += 1;
        println!("`count`: {}", count);
    };

    // Call the closure using a mutable borrow.
    inc();

    // The closure still mutably borrows `count` because it is called later.
    // An attempt to reborrow will lead to an error.
    // let _reborrow = &count; 
    // ^ TODO: try uncommenting this line.
    inc();

    // The closure no longer needs to borrow `&mut count`. Therefore, it is
    // possible to reborrow without an error
    let _count_reborrowed = &mut count; 

    
    // A non-copy type.
    let movable = Box::new(3);

    // `mem::drop` requires `T` so this must take by value. A copy type
    // would copy into the closure leaving the original untouched.
    // A non-copy must move and so `movable` immediately moves into
    // the closure.
    let consume = || {
        println!("`movable`: {:?}", movable);
        mem::drop(movable);
    };

    // `consume` consumes the variable so this can only be called once.
    consume();
    // consume();
    // ^ TODO: Try uncommenting this line.
}

Using move before vertical pipes forces closure to take ownership of captured variables:

fn main() {
    // `Vec` has non-copy semantics.
    let haystack = vec![1, 2, 3];

    let contains = move |needle| haystack.contains(needle);

    println!("{}", contains(&1));
    println!("{}", contains(&4));

    // println!("There're {} elements in vec", haystack.len());
    // ^ Uncommenting above line will result in compile-time error
    // because borrow checker doesn't allow re-using variable after it
    // has been moved.
    
    // Removing `move` from closure's signature will cause closure
    // to borrow _haystack_ variable immutably, hence _haystack_ is still
    // available and uncommenting above line will not cause an error.
}

See also:

Box and std::mem::drop

As input parameters

While Rust chooses how to capture variables on the fly mostly without type annotation, this ambiguity is not allowed when writing functions. When taking a closure as an input parameter, the closure's complete type must be annotated using one of a few traits. In order of decreasing restriction, they are:

  • Fn: the closure captures by reference (&T)
  • FnMut: the closure captures by mutable reference (&mut T)
  • FnOnce: the closure captures by value (T)

On a variable-by-variable basis, the compiler will capture variables in the least restrictive manner possible.

For instance, consider a parameter annotated as FnOnce. This specifies that the closure may capture by &T, &mut T, or T, but the compiler will ultimately choose based on how the captured variables are used in the closure.

This is because if a move is possible, then any type of borrow should also be possible. Note that the reverse is not true. If the parameter is annotated as Fn, then capturing variables by &mut T or T are not allowed.

In the following example, try swapping the usage of Fn, FnMut, and FnOnce to see what happens:

// A function which takes a closure as an argument and calls it.
// <F> denotes that F is a "Generic type parameter"
fn apply<F>(f: F) where
    // The closure takes no input and returns nothing.
    F: FnOnce() {
    // ^ TODO: Try changing this to `Fn` or `FnMut`.

    f();
}

// A function which takes a closure and returns an `i32`.
fn apply_to_3<F>(f: F) -> i32 where
    // The closure takes an `i32` and returns an `i32`.
    F: Fn(i32) -> i32 {

    f(3)
}

fn main() {
    use std::mem;

    let greeting = "hello";
    // A non-copy type.
    // `to_owned` creates owned data from borrowed one
    let mut farewell = "goodbye".to_owned();

    // Capture 2 variables: `greeting` by reference and
    // `farewell` by value.
    let diary = || {
        // `greeting` is by reference: requires `Fn`.
        println!("I said {}.", greeting);

        // Mutation forces `farewell` to be captured by
        // mutable reference. Now requires `FnMut`.
        farewell.push_str("!!!");
        println!("Then I screamed {}.", farewell);
        println!("Now I can sleep. zzzzz");

        // Manually calling drop forces `farewell` to
        // be captured by value. Now requires `FnOnce`.
        mem::drop(farewell);
    };

    // Call the function which applies the closure.
    apply(diary);

    // `double` satisfies `apply_to_3`'s trait bound
    let double = |x| 2 * x;

    println!("3 doubled: {}", apply_to_3(double));
}

See also:

std::mem::drop, Fn, FnMut, Generics, where and FnOnce

Type anonymity

Closures succinctly capture variables from enclosing scopes. Does this have any consequences? It surely does. Observe how using a closure as a function parameter requires generics, which is necessary because of how they are defined:

#![allow(unused)]
fn main() {
// `F` must be generic.
fn apply<F>(f: F) where
    F: FnOnce() {
    f();
}
}

When a closure is defined, the compiler implicitly creates a new anonymous structure to store the captured variables inside, meanwhile implementing the functionality via one of the traits: Fn, FnMut, or FnOnce for this unknown type. This type is assigned to the variable which is stored until calling.

Since this new type is of unknown type, any usage in a function will require generics. However, an unbounded type parameter <T> would still be ambiguous and not be allowed. Thus, bounding by one of the traits: Fn, FnMut, or FnOnce (which it implements) is sufficient to specify its type.

// `F` must implement `Fn` for a closure which takes no
// inputs and returns nothing - exactly what is required
// for `print`.
fn apply<F>(f: F) where
    F: Fn() {
    f();
}

fn main() {
    let x = 7;

    // Capture `x` into an anonymous type and implement
    // `Fn` for it. Store it in `print`.
    let print = || println!("{}", x);

    apply(print);
}

See also:

A thorough analysis, Fn, FnMut, and FnOnce

Input functions

Since closures may be used as arguments, you might wonder if the same can be said about functions. And indeed they can! If you declare a function that takes a closure as parameter, then any function that satisfies the trait bound of that closure can be passed as a parameter.

// Define a function which takes a generic `F` argument
// bounded by `Fn`, and calls it
fn call_me<F: Fn()>(f: F) {
    f();
}

// Define a wrapper function satisfying the `Fn` bound
fn function() {
    println!("I'm a function!");
}

fn main() {
    // Define a closure satisfying the `Fn` bound
    let closure = || println!("I'm a closure!");

    call_me(closure);
    call_me(function);
}

As an additional note, the Fn, FnMut, and FnOnce traits dictate how a closure captures variables from the enclosing scope.

See also:

Fn, FnMut, and FnOnce

As output parameters

Closures as input parameters are possible, so returning closures as output parameters should also be possible. However, anonymous closure types are, by definition, unknown, so we have to use impl Trait to return them.

The valid traits for returning a closure are:

  • Fn
  • FnMut
  • FnOnce

Beyond this, the move keyword must be used, which signals that all captures occur by value. This is required because any captures by reference would be dropped as soon as the function exited, leaving invalid references in the closure.

fn create_fn() -> impl Fn() {
    let text = "Fn".to_owned();

    move || println!("This is a: {}", text)
}

fn create_fnmut() -> impl FnMut() {
    let text = "FnMut".to_owned();

    move || println!("This is a: {}", text)
}

fn create_fnonce() -> impl FnOnce() {
    let text = "FnOnce".to_owned();

    move || println!("This is a: {}", text)
}

fn main() {
    let fn_plain = create_fn();
    let mut fn_mut = create_fnmut();
    let fn_once = create_fnonce();

    fn_plain();
    fn_mut();
    fn_once();
}

See also:

Fn, FnMut, Generics and impl Trait.

Examples in std

This section contains a few examples of using closures from the std library.

Iterator::any

Iterator::any is a function which when passed an iterator, will return true if any element satisfies the predicate. Otherwise false. Its signature:

pub trait Iterator {
    // The type being iterated over.
    type Item;

    // `any` takes `&mut self` meaning the caller may be borrowed
    // and modified, but not consumed.
    fn any<F>(&mut self, f: F) -> bool where
        // `FnMut` meaning any captured variable may at most be
        // modified, not consumed. `Self::Item` states it takes
        // arguments to the closure by value.
        F: FnMut(Self::Item) -> bool {}
}
fn main() {
    let vec1 = vec![1, 2, 3];
    let vec2 = vec![4, 5, 6];

    // `iter()` for vecs yields `&i32`. Destructure to `i32`.
    println!("2 in vec1: {}", vec1.iter()     .any(|&x| x == 2));
    // `into_iter()` for vecs yields `i32`. No destructuring required.
    println!("2 in vec2: {}", vec2.into_iter().any(| x| x == 2));

    let array1 = [1, 2, 3];
    let array2 = [4, 5, 6];

    // `iter()` for arrays yields `&i32`.
    println!("2 in array1: {}", array1.iter()     .any(|&x| x == 2));
    // `into_iter()` for arrays unusually yields `&i32`.
    println!("2 in array2: {}", array2.into_iter().any(|&x| x == 2));
}

See also:

std::iter::Iterator::any

Searching through iterators

Iterator::find is a function which iterates over an iterator and searches for the first value which satisfies some condition. If none of the values satisfy the condition, it returns None. Its signature:

pub trait Iterator {
    // The type being iterated over.
    type Item;

    // `find` takes `&mut self` meaning the caller may be borrowed
    // and modified, but not consumed.
    fn find<P>(&mut self, predicate: P) -> Option<Self::Item> where
        // `FnMut` meaning any captured variable may at most be
        // modified, not consumed. `&Self::Item` states it takes
        // arguments to the closure by reference.
        P: FnMut(&Self::Item) -> bool {}
}
fn main() {
    let vec1 = vec![1, 2, 3];
    let vec2 = vec![4, 5, 6];

    // `iter()` for vecs yields `&i32`.
    let mut iter = vec1.iter();
    // `into_iter()` for vecs yields `i32`.
    let mut into_iter = vec2.into_iter();

    // `iter()` for vecs yields `&i32`, and we want to reference one of its
    // items, so we have to destructure `&&i32` to `i32`
    println!("Find 2 in vec1: {:?}", iter     .find(|&&x| x == 2));
    // `into_iter()` for vecs yields `i32`, and we want to reference one of
    // its items, so we have to destructure `&i32` to `i32`
    println!("Find 2 in vec2: {:?}", into_iter.find(| &x| x == 2));

    let array1 = [1, 2, 3];
    let array2 = [4, 5, 6];

    // `iter()` for arrays yields `&i32`
    println!("Find 2 in array1: {:?}", array1.iter()     .find(|&&x| x == 2));
    // `into_iter()` for arrays unusually yields `&i32`
    println!("Find 2 in array2: {:?}", array2.into_iter().find(|&&x| x == 2));
}

Iterator::find gives you a reference to the item. But if you want the index of the item, use Iterator::position.

fn main() {
    let vec = vec![1, 9, 3, 3, 13, 2];

    let index_of_first_even_number = vec.iter().position(|x| x % 2 == 0);
    assert_eq!(index_of_first_even_number, Some(5));
    
    
    let index_of_first_negative_number = vec.iter().position(|x| x < &0);
    assert_eq!(index_of_first_negative_number, None);
}

See also:

std::iter::Iterator::find

std::iter::Iterator::find_map

std::iter::Iterator::position

std::iter::Iterator::rposition

Higher Order Functions

Rust provides Higher Order Functions (HOF). These are functions that take one or more functions and/or produce a more useful function. HOFs and lazy iterators give Rust its functional flavor.

fn is_odd(n: u32) -> bool {
    n % 2 == 1
}

fn main() {
    println!("Find the sum of all the squared odd numbers under 1000");
    let upper = 1000;

    // Imperative approach
    // Declare accumulator variable
    let mut acc = 0;
    // Iterate: 0, 1, 2, ... to infinity
    for n in 0.. {
        // Square the number
        let n_squared = n * n;

        if n_squared >= upper {
            // Break loop if exceeded the upper limit
            break;
        } else if is_odd(n_squared) {
            // Accumulate value, if it's odd
            acc += n_squared;
        }
    }
    println!("imperative style: {}", acc);

    // Functional approach
    let sum_of_squared_odd_numbers: u32 =
        (0..).map(|n| n * n)                             // All natural numbers squared
             .take_while(|&n_squared| n_squared < upper) // Below upper limit
             .filter(|&n_squared| is_odd(n_squared))     // That are odd
             .fold(0, |acc, n_squared| acc + n_squared); // Sum them
    println!("functional style: {}", sum_of_squared_odd_numbers);
}

Option and Iterator implement their fair share of HOFs.

Diverging functions

Diverging functions never return. They are marked using !, which is an empty type.

#![allow(unused)]
fn main() {
fn foo() -> ! {
    panic!("This call never returns.");
}
}

As opposed to all the other types, this one cannot be instantiated, because the set of all possible values this type can have is empty. Note that, it is different from the () type, which has exactly one possible value.

For example, this function returns as usual, although there is no information in the return value.

fn some_fn() {
    ()
}

fn main() {
    let a: () = some_fn();
    println!("This function returns and you can see this line.")
}

As opposed to this function, which will never return the control back to the caller.

#![feature(never_type)]
// TODO : ^ Remove this on Rust 2024

fn main() {
    let x: ! = panic!("This call never returns.");
    println!("You will never see this line!");
}

Although this might seem like an abstract concept, it is in fact very useful and often handy. The main advantage of this type is that it can be cast to any other one and therefore used at places where an exact type is required, for instance in match branches. This allows us to write code like this:

fn main() {
    fn sum_odd_numbers(up_to: u32) -> u32 {
        let mut acc = 0;
        for i in 0..up_to {
            // Notice that the return type of this match expression must be u32
            // because of the type of the "addition" variable.
            let addition: u32 = match i%2 == 1 {
                // The "i" variable is of type u32, which is perfectly fine.
                true => i,
                // On the other hand, the "continue" expression does not return
                // u32, but it is still fine, because it never returns and therefore
                // does not violate the type requirements of the match expression.
                false => continue,
            };
            acc += addition;
        }
        acc
    }
    println!("Sum of odd numbers up to 9 (excluding): {}", sum_odd_numbers(9));
}

It is also the return type of functions that loop forever (e.g. loop {}) like network servers or functions that terminates the process (e.g. exit()).

Modules

Rust provides a powerful module system that can be used to hierarchically split code in logical units (modules), and manage visibility (public/private) between them.

A module is a collection of items: functions, structs, traits, impl blocks, and even other modules.

Visibility

By default, the items in a module have private visibility, but this can be overridden with the pub modifier. Only the public items of a module can be accessed from outside the module scope.

// A module named `my_mod`
mod my_mod {
    // Items in modules default to private visibility.
    fn private_function() {
        println!("called `my_mod::private_function()`");
    }

    // Use the `pub` modifier to override default visibility.
    pub fn function() {
        println!("called `my_mod::function()`");
    }

    // Items can access other items in the same module,
    // even when private.
    pub fn indirect_access() {
        print!("called `my_mod::indirect_access()`, that\n> ");
        private_function();
    }

    // Modules can also be nested
    pub mod nested {
        pub fn function() {
            println!("called `my_mod::nested::function()`");
        }

        #[allow(dead_code)]
        fn private_function() {
            println!("called `my_mod::nested::private_function()`");
        }

        // Functions declared using `pub(in path)` syntax are only visible
        // within the given path. `path` must be a parent or ancestor module
        pub(in crate::my_mod) fn public_function_in_my_mod() {
            print!("called `my_mod::nested::public_function_in_my_mod()`, that\n> ");
            public_function_in_nested();
        }

        // Functions declared using `pub(self)` syntax are only visible within
        // the current module, which is the same as leaving them private
        pub(self) fn public_function_in_nested() {
            println!("called `my_mod::nested::public_function_in_nested()`");
        }

        // Functions declared using `pub(super)` syntax are only visible within
        // the parent module
        pub(super) fn public_function_in_super_mod() {
            println!("called `my_mod::nested::public_function_in_super_mod()`");
        }
    }

    pub fn call_public_function_in_my_mod() {
        print!("called `my_mod::call_public_function_in_my_mod()`, that\n> ");
        nested::public_function_in_my_mod();
        print!("> ");
        nested::public_function_in_super_mod();
    }

    // pub(crate) makes functions visible only within the current crate
    pub(crate) fn public_function_in_crate() {
        println!("called `my_mod::public_function_in_crate()`");
    }

    // Nested modules follow the same rules for visibility
    mod private_nested {
        #[allow(dead_code)]
        pub fn function() {
            println!("called `my_mod::private_nested::function()`");
        }

        // Private parent items will still restrict the visibility of a child item,
        // even if it is declared as visible within a bigger scope.
        #[allow(dead_code)]
        pub(crate) fn restricted_function() {
            println!("called `my_mod::private_nested::restricted_function()`");
        }
    }
}

fn function() {
    println!("called `function()`");
}

fn main() {
    // Modules allow disambiguation between items that have the same name.
    function();
    my_mod::function();

    // Public items, including those inside nested modules, can be
    // accessed from outside the parent module.
    my_mod::indirect_access();
    my_mod::nested::function();
    my_mod::call_public_function_in_my_mod();

    // pub(crate) items can be called from anywhere in the same crate
    my_mod::public_function_in_crate();

    // pub(in path) items can only be called from within the module specified
    // Error! function `public_function_in_my_mod` is private
    //my_mod::nested::public_function_in_my_mod();
    // TODO ^ Try uncommenting this line

    // Private items of a module cannot be directly accessed, even if
    // nested in a public module:

    // Error! `private_function` is private
    //my_mod::private_function();
    // TODO ^ Try uncommenting this line

    // Error! `private_function` is private
    //my_mod::nested::private_function();
    // TODO ^ Try uncommenting this line

    // Error! `private_nested` is a private module
    //my_mod::private_nested::function();
    // TODO ^ Try uncommenting this line

    // Error! `private_nested` is a private module
    //my_mod::private_nested::restricted_function();
    // TODO ^ Try uncommenting this line
}

Struct visibility

Structs have an extra level of visibility with their fields. The visibility defaults to private, and can be overridden with the pub modifier. This visibility only matters when a struct is accessed from outside the module where it is defined, and has the goal of hiding information (encapsulation).

mod my {
    // A public struct with a public field of generic type `T`
    pub struct OpenBox<T> {
        pub contents: T,
    }

    // A public struct with a private field of generic type `T`
    #[allow(dead_code)]
    pub struct ClosedBox<T> {
        contents: T,
    }

    impl<T> ClosedBox<T> {
        // A public constructor method
        pub fn new(contents: T) -> ClosedBox<T> {
            ClosedBox {
                contents: contents,
            }
        }
    }
}

fn main() {
    // Public structs with public fields can be constructed as usual
    let open_box = my::OpenBox { contents: "public information" };

    // and their fields can be normally accessed.
    println!("The open box contains: {}", open_box.contents);

    // Public structs with private fields cannot be constructed using field names.
    // Error! `ClosedBox` has private fields
    //let closed_box = my::ClosedBox { contents: "classified information" };
    // TODO ^ Try uncommenting this line

    // However, structs with private fields can be created using
    // public constructors
    let _closed_box = my::ClosedBox::new("classified information");

    // and the private fields of a public struct cannot be accessed.
    // Error! The `contents` field is private
    //println!("The closed box contains: {}", _closed_box.contents);
    // TODO ^ Try uncommenting this line
}

See also:

generics and methods

The use declaration

The use declaration can be used to bind a full path to a new name, for easier access. It is often used like this:

use crate::deeply::nested::{
    my_first_function,
    my_second_function,
    AndATraitType
};

fn main() {
    my_first_function();
}

You can use the as keyword to bind imports to a different name:

// Bind the `deeply::nested::function` path to `other_function`.
use deeply::nested::function as other_function;

fn function() {
    println!("called `function()`");
}

mod deeply {
    pub mod nested {
        pub fn function() {
            println!("called `deeply::nested::function()`");
        }
    }
}

fn main() {
    // Easier access to `deeply::nested::function`
    other_function();

    println!("Entering block");
    {
        // This is equivalent to `use deeply::nested::function as function`.
        // This `function()` will shadow the outer one.
        use crate::deeply::nested::function;

        // `use` bindings have a local scope. In this case, the
        // shadowing of `function()` is only in this block.
        function();

        println!("Leaving block");
    }

    function();
}

super and self

The super and self keywords can be used in the path to remove ambiguity when accessing items and to prevent unnecessary hardcoding of paths.

fn function() {
    println!("called `function()`");
}

mod cool {
    pub fn function() {
        println!("called `cool::function()`");
    }
}

mod my {
    fn function() {
        println!("called `my::function()`");
    }
    
    mod cool {
        pub fn function() {
            println!("called `my::cool::function()`");
        }
    }
    
    pub fn indirect_call() {
        // Let's access all the functions named `function` from this scope!
        print!("called `my::indirect_call()`, that\n> ");
        
        // The `self` keyword refers to the current module scope - in this case `my`.
        // Calling `self::function()` and calling `function()` directly both give
        // the same result, because they refer to the same function.
        self::function();
        function();
        
        // We can also use `self` to access another module inside `my`:
        self::cool::function();
        
        // The `super` keyword refers to the parent scope (outside the `my` module).
        super::function();
        
        // This will bind to the `cool::function` in the *crate* scope.
        // In this case the crate scope is the outermost scope.
        {
            use crate::cool::function as root_function;
            root_function();
        }
    }
}

fn main() {
    my::indirect_call();
}

File hierarchy

Modules can be mapped to a file/directory hierarchy. Let's break down the visibility example in files:

$ tree .
.
|-- my
|   |-- inaccessible.rs
|   |-- mod.rs
|   `-- nested.rs
`-- split.rs

In split.rs:

// This declaration will look for a file named `my.rs` or `my/mod.rs` and will
// insert its contents inside a module named `my` under this scope
mod my;

fn function() {
    println!("called `function()`");
}

fn main() {
    my::function();

    function();

    my::indirect_access();

    my::nested::function();
}

In my/mod.rs:

// Similarly `mod inaccessible` and `mod nested` will locate the `nested.rs`
// and `inaccessible.rs` files and insert them here under their respective
// modules
mod inaccessible;
pub mod nested;

pub fn function() {
    println!("called `my::function()`");
}

fn private_function() {
    println!("called `my::private_function()`");
}

pub fn indirect_access() {
    print!("called `my::indirect_access()`, that\n> ");

    private_function();
}

In my/nested.rs:

pub fn function() {
    println!("called `my::nested::function()`");
}

#[allow(dead_code)]
fn private_function() {
    println!("called `my::nested::private_function()`");
}

In my/inaccessible.rs:

#[allow(dead_code)]
pub fn public_function() {
    println!("called `my::inaccessible::public_function()`");
}

Let's check that things still work as before:

$ rustc split.rs && ./split
called `my::function()`
called `function()`
called `my::indirect_access()`, that
> called `my::private_function()`
called `my::nested::function()`

Crates

A crate is a compilation unit in Rust. Whenever rustc some_file.rs is called, some_file.rs is treated as the crate file. If some_file.rs has mod declarations in it, then the contents of the module files would be inserted in places where mod declarations in the crate file are found, before running the compiler over it. In other words, modules do not get compiled individually, only crates get compiled.

A crate can be compiled into a binary or into a library. By default, rustc will produce a binary from a crate. This behavior can be overridden by passing the --crate-type flag to lib.

Creating a Library

Let's create a library, and then see how to link it to another crate.

pub fn public_function() {
    println!("called rary's `public_function()`");
}

fn private_function() {
    println!("called rary's `private_function()`");
}

pub fn indirect_access() {
    print!("called rary's `indirect_access()`, that\n> ");

    private_function();
}
$ rustc --crate-type=lib rary.rs
$ ls lib*
library.rlib

Libraries get prefixed with "lib", and by default they get named after their crate file, but this default name can be overridden by passing the --crate-name option to rustc or by using the crate_name attribute.

Using a Library

To link a crate to this new library you may use rustc's --extern flag. All of its items will then be imported under a module named the same as the library. This module generally behaves the same way as any other module.

// extern crate rary; // May be required for Rust 2015 edition or earlier

fn main() {
    rary::public_function();

    // Error! `private_function` is private
    //rary::private_function();

    rary::indirect_access();
}
# Where library.rlib is the path to the compiled library, assumed that it's
# in the same directory here:
$ rustc executable.rs --extern rary=library.rlib --edition=2018 && ./executable 
called rary's `public_function()`
called rary's `indirect_access()`, that
> called rary's `private_function()`

Cargo

cargo is the official Rust package management tool. It has lots of really useful features to improve code quality and developer velocity! These include

  • Dependency management and integration with crates.io (the official Rust package registry)
  • Awareness of unit tests
  • Awareness of benchmarks

This chapter will go through some quick basics, but you can find the comprehensive docs in The Cargo Book.

Dependencies

Most programs have dependencies on some libraries. If you have ever managed dependencies by hand, you know how much of a pain this can be. Luckily, the Rust ecosystem comes standard with cargo! cargo can manage dependencies for a project.

To create a new Rust project,

# A binary
cargo new foo

# OR A library
cargo new --lib foo

For the rest of this chapter, let's assume we are making a binary, rather than a library, but all of the concepts are the same.

After the above commands, you should see a file hierarchy like this:

foo
├── Cargo.toml
└── src
    └── main.rs

The main.rs is the root source file for your new project -- nothing new there. The Cargo.toml is the config file for cargo for this project (foo). If you look inside it, you should see something like this:

[package]
name = "foo"
version = "0.1.0"
authors = ["mark"]

[dependencies]

The name field under [package] determines the name of the project. This is used by crates.io if you publish the crate (more later). It is also the name of the output binary when you compile.

The version field is a crate version number using Semantic Versioning.

The authors field is a list of authors used when publishing the crate.

The [dependencies] section lets you add dependencies for your project.

For example, suppose that we want our program to have a great CLI. You can find lots of great packages on crates.io (the official Rust package registry). One popular choice is clap. As of this writing, the most recent published version of clap is 2.27.1. To add a dependency to our program, we can simply add the following to our Cargo.toml under [dependencies]: clap = "2.27.1". And of course, extern crate clap in main.rs, just like normal. And that's it! You can start using clap in your program.

cargo also supports other types of dependencies. Here is just a small sampling:

[package]
name = "foo"
version = "0.1.0"
authors = ["mark"]

[dependencies]
clap = "2.27.1" # from crates.io
rand = { git = "https://github.com/rust-lang-nursery/rand" } # from online repo
bar = { path = "../bar" } # from a path in the local filesystem

cargo is more than a dependency manager. All of the available configuration options are listed in the format specification of Cargo.toml.

To build our project we can execute cargo build anywhere in the project directory (including subdirectories!). We can also do cargo run to build and run. Notice that these commands will resolve all dependencies, download crates if needed, and build everything, including your crate. (Note that it only rebuilds what it has not already built, similar to make).

Voila! That's all there is to it!

Conventions

In the previous chapter, we saw the following directory hierarchy:

foo
├── Cargo.toml
└── src
    └── main.rs

Suppose that we wanted to have two binaries in the same project, though. What then?

It turns out that cargo supports this. The default binary name is main, as we saw before, but you can add additional binaries by placing them in a bin/ directory:

foo
├── Cargo.toml
└── src
    ├── main.rs
    └── bin
        └── my_other_bin.rs

To tell cargo to compile or run this binary as opposed to the default or other binaries, we just pass cargo the --bin my_other_bin flag, where my_other_bin is the name of the binary we want to work with.

In addition to extra binaries, cargo supports more features such as benchmarks, tests, and examples.

In the next chapter, we will look more closely at tests.

Testing

As we know testing is integral to any piece of software! Rust has first-class support for unit and integration testing (see this chapter in TRPL).

From the testing chapters linked above, we see how to write unit tests and integration tests. Organizationally, we can place unit tests in the modules they test and integration tests in their own tests/ directory:

foo
├── Cargo.toml
├── src
│   └── main.rs
└── tests
    ├── my_test.rs
    └── my_other_test.rs

Each file in tests is a separate integration test.

cargo naturally provides an easy way to run all of your tests!

$ cargo test

You should see output like this:

$ cargo test
   Compiling blah v0.1.0 (file:///nobackup/blah)
    Finished dev [unoptimized + debuginfo] target(s) in 0.89 secs
     Running target/debug/deps/blah-d3b32b97275ec472

running 3 tests
test test_bar ... ok
test test_baz ... ok
test test_foo_bar ... ok
test test_foo ... ok

test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

You can also run tests whose name matches a pattern:

$ cargo test test_foo
$ cargo test test_foo
   Compiling blah v0.1.0 (file:///nobackup/blah)
    Finished dev [unoptimized + debuginfo] target(s) in 0.35 secs
     Running target/debug/deps/blah-d3b32b97275ec472

running 2 tests
test test_foo ... ok
test test_foo_bar ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 2 filtered out

One word of caution: Cargo may run multiple tests concurrently, so make sure that they don't race with each other. For example, if they all output to a file, you should make them write to different files.

Build Scripts

Sometimes a normal build from cargo is not enough. Perhaps your crate needs some pre-requisites before cargo will successfully compile, things like code generation, or some native code that needs to be compiled. To solve this problem we have build scripts that Cargo can run.

To add a build script to your package it can either be specified in the Cargo.toml as follows:

[package]
...
build = "build.rs"

Otherwise Cargo will look for a build.rs file in the project directory by default.

How to use a build script

The build script is simply another Rust file that will be compiled and invoked prior to compiling anything else in the package. Hence it can be used to fulfill pre-requisites of your crate.

Cargo provides the script with inputs via environment variables specified here that can be used.

The script provides output via stdout. All lines printed are written to target/debug/build/<pkg>/output. Further, lines prefixed with cargo: will be interpreted by Cargo directly and hence can be used to define parameters for the package's compilation.

For further specification and examples have a read of the Cargo specification.

속성

속성은 모듈, 크레이트 또는 아이템에 붙이는 메타데이터로, 다음과 같은 상황에 사용할 수 있습니다:

속성은 #[outer_attribute]#![inner_attribute]와 같이 표기하며, 각각에 따라 적용하는 지점이 다릅니다.

  • #[outer_attribute]는 이어지는 항목에 바로 적용됩니다. 예를 들어 함수, 모듈 선언, 상수, 구조체, enum 등이 있습니다. 구조체 Rectangle에 attribute #[derive(debug)]가 적용되는 예를 들어보겠습니다:

    #![allow(unused)]
    fn main() {
    #[derive(Debug)]
    struct Rectangle {
        width: u32,
        height: u32,
    }
    }
  • #![inner_attribute]는 이를 포함하는 아이템 (보통 모듈 또는 크레이트) 전체에 적용됩니다. 즉, 이는 배치된 모든 스코프에 적용하는 것으로 해석됩니다. main.rs에 배치함으로서 #![allow(unused_variables)]가 크레이트 전체에 적용되는 예를 들어보겠습니다:

    #![allow(unused_variables)]
    
    fn main() {
        let x = 3; // This would normally warn about an unused variable.
    }

속성은 여러 방식으로 인수를 받을 수 있습니다:

  • #[attribute = "value"]
  • #[attribute(key = "value")]
  • #[attribute(value)]

인수를 여러 개 받을 수도, 여러 줄에 걸쳐 받을 수도 있습니다:

#[attribute(value, value2)]


#[attribute(value, value2, value3,
            value4, value5)]

dead_code

러스트 컴파일러는 dead_code lint를 통해 사용되지 않은 함수에 대해 경고를 표시합니다. 속성을 이용해 이를 비활성화할 수 있습니다.

fn used_function() {}

// `#[allow(dead_code)]` is an attribute that disables the `dead_code` lint
#[allow(dead_code)]
fn unused_function() {}

fn noisy_unused_function() {}
// FIXME ^ Add an attribute to suppress the warning

fn main() {
    used_function();
}

실제 프로그램에서는 이러한 "죽은 코드"를 없애야 합니다. 이 예시들에서는 예시들의 대화형 특성에 의해 여러 "죽은 코드"를 허용하고 있습니다.

Crates

The crate_type attribute can be used to tell the compiler whether a crate is a binary or a library (and even which type of library), and the crate_name attribute can be used to set the name of the crate.

However, it is important to note that both the crate_type and crate_name attributes have no effect whatsoever when using Cargo, the Rust package manager. Since Cargo is used for the majority of Rust projects, this means real-world uses of crate_type and crate_name are relatively limited.

// This crate is a library
#![crate_type = "lib"]
// The library is named "rary"
#![crate_name = "rary"]

pub fn public_function() {
    println!("called rary's `public_function()`");
}

fn private_function() {
    println!("called rary's `private_function()`");
}

pub fn indirect_access() {
    print!("called rary's `indirect_access()`, that\n> ");

    private_function();
}

When the crate_type attribute is used, we no longer need to pass the --crate-type flag to rustc.

$ rustc lib.rs
$ ls lib*
library.rlib

cfg

Configuration conditional checks are possible through two different operators:

  • the cfg attribute: #[cfg(...)] in attribute position
  • the cfg! macro: cfg!(...) in boolean expressions

While the former enables conditional compilation, the latter conditionally evaluates to true or false literals allowing for checks at run-time. Both utilize identical argument syntax.

// This function only gets compiled if the target OS is linux
#[cfg(target_os = "linux")]
fn are_you_on_linux() {
    println!("You are running linux!");
}

// And this function only gets compiled if the target OS is *not* linux
#[cfg(not(target_os = "linux"))]
fn are_you_on_linux() {
    println!("You are *not* running linux!");
}

fn main() {
    are_you_on_linux();

    println!("Are you sure?");
    if cfg!(target_os = "linux") {
        println!("Yes. It's definitely linux!");
    } else {
        println!("Yes. It's definitely *not* linux!");
    }
}

See also:

the reference, cfg!, and macros.

Custom

Some conditionals like target_os are implicitly provided by rustc, but custom conditionals must be passed to rustc using the --cfg flag.

#[cfg(some_condition)]
fn conditional_function() {
    println!("condition met!");
}

fn main() {
    conditional_function();
}

Try to run this to see what happens without the custom cfg flag.

With the custom cfg flag:

$ rustc --cfg some_condition custom.rs && ./custom
condition met!

Generics

Generics is the topic of generalizing types and functionalities to broader cases. This is extremely useful for reducing code duplication in many ways, but can call for rather involving syntax. Namely, being generic requires taking great care to specify over which types a generic type is actually considered valid. The simplest and most common use of generics is for type parameters.

A type parameter is specified as generic by the use of angle brackets and upper camel case: <Aaa, Bbb, ...>. "Generic type parameters" are typically represented as <T>. In Rust, "generic" also describes anything that accepts one or more generic type parameters <T>. Any type specified as a generic type parameter is generic, and everything else is concrete (non-generic).

For example, defining a generic function named foo that takes an argument T of any type:

fn foo<T>(arg: T) { ... }

Because T has been specified as a generic type parameter using <T>, it is considered generic when used here as (arg: T). This is the case even if T has previously been defined as a struct.

This example shows some of the syntax in action:

// A concrete type `A`.
struct A;

// In defining the type `Single`, the first use of `A` is not preceded by `<A>`.
// Therefore, `Single` is a concrete type, and `A` is defined as above.
struct Single(A);
//            ^ Here is `Single`s first use of the type `A`.

// Here, `<T>` precedes the first use of `T`, so `SingleGen` is a generic type.
// Because the type parameter `T` is generic, it could be anything, including
// the concrete type `A` defined at the top.
struct SingleGen<T>(T);

fn main() {
    // `Single` is concrete and explicitly takes `A`.
    let _s = Single(A);
    
    // Create a variable `_char` of type `SingleGen<char>`
    // and give it the value `SingleGen('a')`.
    // Here, `SingleGen` has a type parameter explicitly specified.
    let _char: SingleGen<char> = SingleGen('a');

    // `SingleGen` can also have a type parameter implicitly specified:
    let _t    = SingleGen(A); // Uses `A` defined at the top.
    let _i32  = SingleGen(6); // Uses `i32`.
    let _char = SingleGen('a'); // Uses `char`.
}

See also:

structs

Functions

The same set of rules can be applied to functions: a type T becomes generic when preceded by <T>.

Using generic functions sometimes requires explicitly specifying type parameters. This may be the case if the function is called where the return type is generic, or if the compiler doesn't have enough information to infer the necessary type parameters.

A function call with explicitly specified type parameters looks like: fun::<A, B, ...>().

struct A;          // Concrete type `A`.
struct S(A);       // Concrete type `S`.
struct SGen<T>(T); // Generic type `SGen`.

// The following functions all take ownership of the variable passed into
// them and immediately go out of scope, freeing the variable.

// Define a function `reg_fn` that takes an argument `_s` of type `S`.
// This has no `<T>` so this is not a generic function.
fn reg_fn(_s: S) {}

// Define a function `gen_spec_t` that takes an argument `_s` of type `SGen<T>`.
// It has been explicitly given the type parameter `A`, but because `A` has not 
// been specified as a generic type parameter for `gen_spec_t`, it is not generic.
fn gen_spec_t(_s: SGen<A>) {}

// Define a function `gen_spec_i32` that takes an argument `_s` of type `SGen<i32>`.
// It has been explicitly given the type parameter `i32`, which is a specific type.
// Because `i32` is not a generic type, this function is also not generic.
fn gen_spec_i32(_s: SGen<i32>) {}

// Define a function `generic` that takes an argument `_s` of type `SGen<T>`.
// Because `SGen<T>` is preceded by `<T>`, this function is generic over `T`.
fn generic<T>(_s: SGen<T>) {}

fn main() {
    // Using the non-generic functions
    reg_fn(S(A));          // Concrete type.
    gen_spec_t(SGen(A));   // Implicitly specified type parameter `A`.
    gen_spec_i32(SGen(6)); // Implicitly specified type parameter `i32`.

    // Explicitly specified type parameter `char` to `generic()`.
    generic::<char>(SGen('a'));

    // Implicitly specified type parameter `char` to `generic()`.
    generic(SGen('c'));
}

See also:

functions and structs

Implementation

Similar to functions, implementations require care to remain generic.

#![allow(unused)]
fn main() {
struct S; // Concrete type `S`
struct GenericVal<T>(T); // Generic type `GenericVal`

// impl of GenericVal where we explicitly specify type parameters:
impl GenericVal<f32> {} // Specify `f32`
impl GenericVal<S> {} // Specify `S` as defined above

// `<T>` Must precede the type to remain generic
impl<T> GenericVal<T> {}
}
struct Val {
    val: f64,
}

struct GenVal<T> {
    gen_val: T,
}

// impl of Val
impl Val {
    fn value(&self) -> &f64 {
        &self.val
    }
}

// impl of GenVal for a generic type `T`
impl<T> GenVal<T> {
    fn value(&self) -> &T {
        &self.gen_val
    }
}

fn main() {
    let x = Val { val: 3.0 };
    let y = GenVal { gen_val: 3i32 };

    println!("{}, {}", x.value(), y.value());
}

See also:

functions returning references, impl, and struct

Traits

Of course traits can also be generic. Here we define one which reimplements the Drop trait as a generic method to drop itself and an input.

// Non-copyable types.
struct Empty;
struct Null;

// A trait generic over `T`.
trait DoubleDrop<T> {
    // Define a method on the caller type which takes an
    // additional single parameter `T` and does nothing with it.
    fn double_drop(self, _: T);
}

// Implement `DoubleDrop<T>` for any generic parameter `T` and
// caller `U`.
impl<T, U> DoubleDrop<T> for U {
    // This method takes ownership of both passed arguments,
    // deallocating both.
    fn double_drop(self, _: T) {}
}

fn main() {
    let empty = Empty;
    let null  = Null;

    // Deallocate `empty` and `null`.
    empty.double_drop(null);

    //empty;
    //null;
    // ^ TODO: Try uncommenting these lines.
}

See also:

Drop, struct, and trait

Bounds

When working with generics, the type parameters often must use traits as bounds to stipulate what functionality a type implements. For example, the following example uses the trait Display to print and so it requires T to be bound by Display; that is, T must implement Display.

// Define a function `printer` that takes a generic type `T` which
// must implement trait `Display`.
fn printer<T: Display>(t: T) {
    println!("{}", t);
}

Bounding restricts the generic to types that conform to the bounds. That is:

struct S<T: Display>(T);

// Error! `Vec<T>` does not implement `Display`. This
// specialization will fail.
let s = S(vec![1]);

Another effect of bounding is that generic instances are allowed to access the methods of traits specified in the bounds. For example:

// A trait which implements the print marker: `{:?}`.
use std::fmt::Debug;

trait HasArea {
    fn area(&self) -> f64;
}

impl HasArea for Rectangle {
    fn area(&self) -> f64 { self.length * self.height }
}

#[derive(Debug)]
struct Rectangle { length: f64, height: f64 }
#[allow(dead_code)]
struct Triangle  { length: f64, height: f64 }

// The generic `T` must implement `Debug`. Regardless
// of the type, this will work properly.
fn print_debug<T: Debug>(t: &T) {
    println!("{:?}", t);
}

// `T` must implement `HasArea`. Any type which meets
// the bound can access `HasArea`'s function `area`.
fn area<T: HasArea>(t: &T) -> f64 { t.area() }

fn main() {
    let rectangle = Rectangle { length: 3.0, height: 4.0 };
    let _triangle = Triangle  { length: 3.0, height: 4.0 };

    print_debug(&rectangle);
    println!("Area: {}", area(&rectangle));

    //print_debug(&_triangle);
    //println!("Area: {}", area(&_triangle));
    // ^ TODO: Try uncommenting these.
    // | Error: Does not implement either `Debug` or `HasArea`. 
}

As an additional note, where clauses can also be used to apply bounds in some cases to be more expressive.

See also:

std::fmt, structs, and traits

Testcase: empty bounds

A consequence of how bounds work is that even if a trait doesn't include any functionality, you can still use it as a bound. Eq and Copy are examples of such traits from the std library.

struct Cardinal;
struct BlueJay;
struct Turkey;

trait Red {}
trait Blue {}

impl Red for Cardinal {}
impl Blue for BlueJay {}

// These functions are only valid for types which implement these
// traits. The fact that the traits are empty is irrelevant.
fn red<T: Red>(_: &T)   -> &'static str { "red" }
fn blue<T: Blue>(_: &T) -> &'static str { "blue" }

fn main() {
    let cardinal = Cardinal;
    let blue_jay = BlueJay;
    let _turkey   = Turkey;

    // `red()` won't work on a blue jay nor vice versa
    // because of the bounds.
    println!("A cardinal is {}", red(&cardinal));
    println!("A blue jay is {}", blue(&blue_jay));
    //println!("A turkey is {}", red(&_turkey));
    // ^ TODO: Try uncommenting this line.
}

See also:

std::cmp::Eq, std::marker::Copy, and traits

Multiple bounds

Multiple bounds can be applied with a +. Like normal, different types are separated with ,.

use std::fmt::{Debug, Display};

fn compare_prints<T: Debug + Display>(t: &T) {
    println!("Debug: `{:?}`", t);
    println!("Display: `{}`", t);
}

fn compare_types<T: Debug, U: Debug>(t: &T, u: &U) {
    println!("t: `{:?}`", t);
    println!("u: `{:?}`", u);
}

fn main() {
    let string = "words";
    let array = [1, 2, 3];
    let vec = vec![1, 2, 3];

    compare_prints(&string);
    //compare_prints(&array);
    // TODO ^ Try uncommenting this.

    compare_types(&array, &vec);
}

See also:

std::fmt and traits

Where clauses

A bound can also be expressed using a where clause immediately before the opening {, rather than at the type's first mention. Additionally, where clauses can apply bounds to arbitrary types, rather than just to type parameters.

Some cases that a where clause is useful:

  • When specifying generic types and bounds separately is clearer:
impl <A: TraitB + TraitC, D: TraitE + TraitF> MyTrait<A, D> for YourType {}

// Expressing bounds with a `where` clause
impl <A, D> MyTrait<A, D> for YourType where
    A: TraitB + TraitC,
    D: TraitE + TraitF {}
  • When using a where clause is more expressive than using normal syntax. The impl in this example cannot be directly expressed without a where clause:
use std::fmt::Debug;

trait PrintInOption {
    fn print_in_option(self);
}

// Because we would otherwise have to express this as `T: Debug` or 
// use another method of indirect approach, this requires a `where` clause:
impl<T> PrintInOption for T where
    Option<T>: Debug {
    // We want `Option<T>: Debug` as our bound because that is what's
    // being printed. Doing otherwise would be using the wrong bound.
    fn print_in_option(self) {
        println!("{:?}", Some(self));
    }
}

fn main() {
    let vec = vec![1, 2, 3];

    vec.print_in_option();
}

See also:

RFC, struct, and trait

New Type Idiom

The newtype idiom gives compile time guarantees that the right type of value is supplied to a program.

For example, an age verification function that checks age in years, must be given a value of type Years.

struct Years(i64);

struct Days(i64);

impl Years {
    pub fn to_days(&self) -> Days {
        Days(self.0 * 365)
    }
}


impl Days {
    /// truncates partial years
    pub fn to_years(&self) -> Years {
        Years(self.0 / 365)
    }
}

fn old_enough(age: &Years) -> bool {
    age.0 >= 18
}

fn main() {
    let age = Years(5);
    let age_days = age.to_days();
    println!("Old enough {}", old_enough(&age));
    println!("Old enough {}", old_enough(&age_days.to_years()));
    // println!("Old enough {}", old_enough(&age_days));
}

Uncomment the last print statement to observe that the type supplied must be Years.

To obtain the newtype's value as the base type, you may use tuple syntax like so:

struct Years(i64);

fn main() {
    let years = Years(42);
    let years_as_primitive: i64 = years.0;
}

See also:

structs

Associated items

"Associated Items" refers to a set of rules pertaining to items of various types. It is an extension to trait generics, and allows traits to internally define new items.

One such item is called an associated type, providing simpler usage patterns when the trait is generic over its container type.

See also:

RFC

The Problem

A trait that is generic over its container type has type specification requirements - users of the trait must specify all of its generic types.

In the example below, the Contains trait allows the use of the generic types A and B. The trait is then implemented for the Container type, specifying i32 for A and B so that it can be used with fn difference().

Because Contains is generic, we are forced to explicitly state all of the generic types for fn difference(). In practice, we want a way to express that A and B are determined by the input C. As you will see in the next section, associated types provide exactly that capability.

struct Container(i32, i32);

// A trait which checks if 2 items are stored inside of container.
// Also retrieves first or last value.
trait Contains<A, B> {
    fn contains(&self, _: &A, _: &B) -> bool; // Explicitly requires `A` and `B`.
    fn first(&self) -> i32; // Doesn't explicitly require `A` or `B`.
    fn last(&self) -> i32;  // Doesn't explicitly require `A` or `B`.
}

impl Contains<i32, i32> for Container {
    // True if the numbers stored are equal.
    fn contains(&self, number_1: &i32, number_2: &i32) -> bool {
        (&self.0 == number_1) && (&self.1 == number_2)
    }

    // Grab the first number.
    fn first(&self) -> i32 { self.0 }

    // Grab the last number.
    fn last(&self) -> i32 { self.1 }
}

// `C` contains `A` and `B`. In light of that, having to express `A` and
// `B` again is a nuisance.
fn difference<A, B, C>(container: &C) -> i32 where
    C: Contains<A, B> {
    container.last() - container.first()
}

fn main() {
    let number_1 = 3;
    let number_2 = 10;

    let container = Container(number_1, number_2);

    println!("Does container contain {} and {}: {}",
        &number_1, &number_2,
        container.contains(&number_1, &number_2));
    println!("First number: {}", container.first());
    println!("Last number: {}", container.last());

    println!("The difference is: {}", difference(&container));
}

See also:

structs, and traits

Associated types

The use of "Associated types" improves the overall readability of code by moving inner types locally into a trait as output types. Syntax for the trait definition is as follows:

#![allow(unused)]
fn main() {
// `A` and `B` are defined in the trait via the `type` keyword.
// (Note: `type` in this context is different from `type` when used for
// aliases).
trait Contains {
    type A;
    type B;

    // Updated syntax to refer to these new types generically.
    fn contains(&self, &Self::A, &Self::B) -> bool;
}
}

Note that functions that use the trait Contains are no longer required to express A or B at all:

// Without using associated types
fn difference<A, B, C>(container: &C) -> i32 where
    C: Contains<A, B> { ... }

// Using associated types
fn difference<C: Contains>(container: &C) -> i32 { ... }

Let's rewrite the example from the previous section using associated types:

struct Container(i32, i32);

// A trait which checks if 2 items are stored inside of container.
// Also retrieves first or last value.
trait Contains {
    // Define generic types here which methods will be able to utilize.
    type A;
    type B;

    fn contains(&self, _: &Self::A, _: &Self::B) -> bool;
    fn first(&self) -> i32;
    fn last(&self) -> i32;
}

impl Contains for Container {
    // Specify what types `A` and `B` are. If the `input` type
    // is `Container(i32, i32)`, the `output` types are determined
    // as `i32` and `i32`.
    type A = i32;
    type B = i32;

    // `&Self::A` and `&Self::B` are also valid here.
    fn contains(&self, number_1: &i32, number_2: &i32) -> bool {
        (&self.0 == number_1) && (&self.1 == number_2)
    }
    // Grab the first number.
    fn first(&self) -> i32 { self.0 }

    // Grab the last number.
    fn last(&self) -> i32 { self.1 }
}

fn difference<C: Contains>(container: &C) -> i32 {
    container.last() - container.first()
}

fn main() {
    let number_1 = 3;
    let number_2 = 10;

    let container = Container(number_1, number_2);

    println!("Does container contain {} and {}: {}",
        &number_1, &number_2,
        container.contains(&number_1, &number_2));
    println!("First number: {}", container.first());
    println!("Last number: {}", container.last());
    
    println!("The difference is: {}", difference(&container));
}

Phantom type parameters

A phantom type parameter is one that doesn't show up at runtime, but is checked statically (and only) at compile time.

Data types can use extra generic type parameters to act as markers or to perform type checking at compile time. These extra parameters hold no storage values, and have no runtime behavior.

In the following example, we combine std::marker::PhantomData with the phantom type parameter concept to create tuples containing different data types.

use std::marker::PhantomData;

// A phantom tuple struct which is generic over `A` with hidden parameter `B`.
#[derive(PartialEq)] // Allow equality test for this type.
struct PhantomTuple<A, B>(A,PhantomData<B>);

// A phantom type struct which is generic over `A` with hidden parameter `B`.
#[derive(PartialEq)] // Allow equality test for this type.
struct PhantomStruct<A, B> { first: A, phantom: PhantomData<B> }

// Note: Storage is allocated for generic type `A`, but not for `B`.
//       Therefore, `B` cannot be used in computations.

fn main() {
    // Here, `f32` and `f64` are the hidden parameters.
    // PhantomTuple type specified as `<char, f32>`.
    let _tuple1: PhantomTuple<char, f32> = PhantomTuple('Q', PhantomData);
    // PhantomTuple type specified as `<char, f64>`.
    let _tuple2: PhantomTuple<char, f64> = PhantomTuple('Q', PhantomData);

    // Type specified as `<char, f32>`.
    let _struct1: PhantomStruct<char, f32> = PhantomStruct {
        first: 'Q',
        phantom: PhantomData,
    };
    // Type specified as `<char, f64>`.
    let _struct2: PhantomStruct<char, f64> = PhantomStruct {
        first: 'Q',
        phantom: PhantomData,
    };
    
    // Compile-time Error! Type mismatch so these cannot be compared:
    //println!("_tuple1 == _tuple2 yields: {}",
    //          _tuple1 == _tuple2);
    
    // Compile-time Error! Type mismatch so these cannot be compared:
    //println!("_struct1 == _struct2 yields: {}",
    //          _struct1 == _struct2);
}

See also:

Derive, struct, and TupleStructs

Testcase: unit clarification

A useful method of unit conversions can be examined by implementing Add with a phantom type parameter. The Add trait is examined below:

// This construction would impose: `Self + RHS = Output`
// where RHS defaults to Self if not specified in the implementation.
pub trait Add<RHS = Self> {
    type Output;

    fn add(self, rhs: RHS) -> Self::Output;
}

// `Output` must be `T<U>` so that `T<U> + T<U> = T<U>`.
impl<U> Add for T<U> {
    type Output = T<U>;
    ...
}

The whole implementation:

use std::ops::Add;
use std::marker::PhantomData;

/// Create void enumerations to define unit types.
#[derive(Debug, Clone, Copy)]
enum Inch {}
#[derive(Debug, Clone, Copy)]
enum Mm {}

/// `Length` is a type with phantom type parameter `Unit`,
/// and is not generic over the length type (that is `f64`).
///
/// `f64` already implements the `Clone` and `Copy` traits.
#[derive(Debug, Clone, Copy)]
struct Length<Unit>(f64, PhantomData<Unit>);

/// The `Add` trait defines the behavior of the `+` operator.
impl<Unit> Add for Length<Unit> {
     type Output = Length<Unit>;

    // add() returns a new `Length` struct containing the sum.
    fn add(self, rhs: Length<Unit>) -> Length<Unit> {
        // `+` calls the `Add` implementation for `f64`.
        Length(self.0 + rhs.0, PhantomData)
    }
}

fn main() {
    // Specifies `one_foot` to have phantom type parameter `Inch`.
    let one_foot:  Length<Inch> = Length(12.0, PhantomData);
    // `one_meter` has phantom type parameter `Mm`.
    let one_meter: Length<Mm>   = Length(1000.0, PhantomData);

    // `+` calls the `add()` method we implemented for `Length<Unit>`.
    //
    // Since `Length` implements `Copy`, `add()` does not consume
    // `one_foot` and `one_meter` but copies them into `self` and `rhs`.
    let two_feet = one_foot + one_foot;
    let two_meters = one_meter + one_meter;

    // Addition works.
    println!("one foot + one_foot = {:?} in", two_feet.0);
    println!("one meter + one_meter = {:?} mm", two_meters.0);

    // Nonsensical operations fail as they should:
    // Compile-time Error: type mismatch.
    //let one_feter = one_foot + one_meter;
}

See also:

Borrowing (&), Bounds (X: Y), enum, impl & self, Overloading, ref, Traits (X for Y), and TupleStructs.

Scoping rules

Scopes play an important part in ownership, borrowing, and lifetimes. That is, they indicate to the compiler when borrows are valid, when resources can be freed, and when variables are created or destroyed.

RAII

Variables in Rust do more than just hold data in the stack: they also own resources, e.g. Box<T> owns memory in the heap. Rust enforces RAII (Resource Acquisition Is Initialization), so whenever an object goes out of scope, its destructor is called and its owned resources are freed.

This behavior shields against resource leak bugs, so you'll never have to manually free memory or worry about memory leaks again! Here's a quick showcase:

// raii.rs
fn create_box() {
    // Allocate an integer on the heap
    let _box1 = Box::new(3i32);

    // `_box1` is destroyed here, and memory gets freed
}

fn main() {
    // Allocate an integer on the heap
    let _box2 = Box::new(5i32);

    // A nested scope:
    {
        // Allocate an integer on the heap
        let _box3 = Box::new(4i32);

        // `_box3` is destroyed here, and memory gets freed
    }

    // Creating lots of boxes just for fun
    // There's no need to manually free memory!
    for _ in 0u32..1_000 {
        create_box();
    }

    // `_box2` is destroyed here, and memory gets freed
}

Of course, we can double check for memory errors using valgrind:

$ rustc raii.rs && valgrind ./raii
==26873== Memcheck, a memory error detector
==26873== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==26873== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==26873== Command: ./raii
==26873==
==26873==
==26873== HEAP SUMMARY:
==26873==     in use at exit: 0 bytes in 0 blocks
==26873==   total heap usage: 1,013 allocs, 1,013 frees, 8,696 bytes allocated
==26873==
==26873== All heap blocks were freed -- no leaks are possible
==26873==
==26873== For counts of detected and suppressed errors, rerun with: -v
==26873== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

No leaks here!

Destructor

The notion of a destructor in Rust is provided through the Drop trait. The destructor is called when the resource goes out of scope. This trait is not required to be implemented for every type, only implement it for your type if you require its own destructor logic.

Run the below example to see how the Drop trait works. When the variable in the main function goes out of scope the custom destructor will be invoked.

struct ToDrop;

impl Drop for ToDrop {
    fn drop(&mut self) {
        println!("ToDrop is being dropped");
    }
}

fn main() {
    let x = ToDrop;
    println!("Made a ToDrop!");
}

See also:

Box

Ownership and moves

Because variables are in charge of freeing their own resources, resources can only have one owner. This also prevents resources from being freed more than once. Note that not all variables own resources (e.g. references).

When doing assignments (let x = y) or passing function arguments by value (foo(x)), the ownership of the resources is transferred. In Rust-speak, this is known as a move.

After moving resources, the previous owner can no longer be used. This avoids creating dangling pointers.

// This function takes ownership of the heap allocated memory
fn destroy_box(c: Box<i32>) {
    println!("Destroying a box that contains {}", c);

    // `c` is destroyed and the memory freed
}

fn main() {
    // _Stack_ allocated integer
    let x = 5u32;

    // *Copy* `x` into `y` - no resources are moved
    let y = x;

    // Both values can be independently used
    println!("x is {}, and y is {}", x, y);

    // `a` is a pointer to a _heap_ allocated integer
    let a = Box::new(5i32);

    println!("a contains: {}", a);

    // *Move* `a` into `b`
    let b = a;
    // The pointer address of `a` is copied (not the data) into `b`.
    // Both are now pointers to the same heap allocated data, but
    // `b` now owns it.
    
    // Error! `a` can no longer access the data, because it no longer owns the
    // heap memory
    //println!("a contains: {}", a);
    // TODO ^ Try uncommenting this line

    // This function takes ownership of the heap allocated memory from `b`
    destroy_box(b);

    // Since the heap memory has been freed at this point, this action would
    // result in dereferencing freed memory, but it's forbidden by the compiler
    // Error! Same reason as the previous Error
    //println!("b contains: {}", b);
    // TODO ^ Try uncommenting this line
}

Mutability

Mutability of data can be changed when ownership is transferred.

fn main() {
    let immutable_box = Box::new(5u32);

    println!("immutable_box contains {}", immutable_box);

    // Mutability error
    //*immutable_box = 4;

    // *Move* the box, changing the ownership (and mutability)
    let mut mutable_box = immutable_box;

    println!("mutable_box contains {}", mutable_box);

    // Modify the contents of the box
    *mutable_box = 4;

    println!("mutable_box now contains {}", mutable_box);
}

Partial moves

Pattern bindings can have by-move and by-reference bindings at the same time which is used in destructuring. Using these pattern will result in partial move for the variable, which means that part of the variable is moved while other parts stayed. In this case, the parent variable cannot be used afterwards as a whole. However, parts of it that are referenced and not moved can be used.

fn main() {
    #[derive(Debug)]
    struct Person {
        name: String,
        age: u8,
    }

    let person = Person {
        name: String::from("Alice"),
        age: 20,
    };

    // `name` is moved out of person, but `age` is referenced
    let Person { name, ref age } = person;

    println!("The person's age is {}", age);

    println!("The person's name is {}", name);

    // Error! borrow of partially moved value: `person` partial move occurs
    //println!("The person struct is {:?}", person);

    // `person` cannot be used but `person.age` can be used as it is not moved
    println!("The person's age from person struct is {}", person.age);
}

See also:

destructuring

Borrowing

Most of the time, we'd like to access data without taking ownership over it. To accomplish this, Rust uses a borrowing mechanism. Instead of passing objects by value (T), objects can be passed by reference (&T).

The compiler statically guarantees (via its borrow checker) that references always point to valid objects. That is, while references to an object exist, the object cannot be destroyed.

// This function takes ownership of a box and destroys it
fn eat_box_i32(boxed_i32: Box<i32>) {
    println!("Destroying box that contains {}", boxed_i32);
}

// This function borrows an i32
fn borrow_i32(borrowed_i32: &i32) {
    println!("This int is: {}", borrowed_i32);
}

fn main() {
    // Create a boxed i32, and a stacked i32
    let boxed_i32 = Box::new(5_i32);
    let stacked_i32 = 6_i32;

    // Borrow the contents of the box. Ownership is not taken,
    // so the contents can be borrowed again.
    borrow_i32(&boxed_i32);
    borrow_i32(&stacked_i32);

    {
        // Take a reference to the data contained inside the box
        let _ref_to_i32: &i32 = &boxed_i32;

        // Error!
        // Can't destroy `boxed_i32` while the inner value is borrowed later in scope.
        eat_box_i32(boxed_i32);
        // FIXME ^ Comment out this line

        // Attempt to borrow `_ref_to_i32` after inner value is destroyed
        borrow_i32(_ref_to_i32);
        // `_ref_to_i32` goes out of scope and is no longer borrowed.
    }

    // `boxed_i32` can now give up ownership to `eat_box` and be destroyed
    eat_box_i32(boxed_i32);
}

Mutability

Mutable data can be mutably borrowed using &mut T. This is called a mutable reference and gives read/write access to the borrower. In contrast, &T borrows the data via an immutable reference, and the borrower can read the data but not modify it:

#[allow(dead_code)]
#[derive(Clone, Copy)]
struct Book {
    // `&'static str` is a reference to a string allocated in read only memory
    author: &'static str,
    title: &'static str,
    year: u32,
}

// This function takes a reference to a book
fn borrow_book(book: &Book) {
    println!("I immutably borrowed {} - {} edition", book.title, book.year);
}

// This function takes a reference to a mutable book and changes `year` to 2014
fn new_edition(book: &mut Book) {
    book.year = 2014;
    println!("I mutably borrowed {} - {} edition", book.title, book.year);
}

fn main() {
    // Create an immutable Book named `immutabook`
    let immutabook = Book {
        // string literals have type `&'static str`
        author: "Douglas Hofstadter",
        title: "Gödel, Escher, Bach",
        year: 1979,
    };

    // Create a mutable copy of `immutabook` and call it `mutabook`
    let mut mutabook = immutabook;
    
    // Immutably borrow an immutable object
    borrow_book(&immutabook);

    // Immutably borrow a mutable object
    borrow_book(&mutabook);
    
    // Borrow a mutable object as mutable
    new_edition(&mut mutabook);
    
    // Error! Cannot borrow an immutable object as mutable
    new_edition(&mut immutabook);
    // FIXME ^ Comment out this line
}

See also:

static

Aliasing

Data can be immutably borrowed any number of times, but while immutably borrowed, the original data can't be mutably borrowed. On the other hand, only one mutable borrow is allowed at a time. The original data can be borrowed again only after the mutable reference has been used for the last time.

struct Point { x: i32, y: i32, z: i32 }

fn main() {
    let mut point = Point { x: 0, y: 0, z: 0 };

    let borrowed_point = &point;
    let another_borrow = &point;

    // Data can be accessed via the references and the original owner
    println!("Point has coordinates: ({}, {}, {})",
                borrowed_point.x, another_borrow.y, point.z);

    // Error! Can't borrow `point` as mutable because it's currently
    // borrowed as immutable.
    // let mutable_borrow = &mut point;
    // TODO ^ Try uncommenting this line

    // The borrowed values are used again here
    println!("Point has coordinates: ({}, {}, {})",
                borrowed_point.x, another_borrow.y, point.z);

    // The immutable references are no longer used for the rest of the code so
    // it is possible to reborrow with a mutable reference.
    let mutable_borrow = &mut point;

    // Change data via mutable reference
    mutable_borrow.x = 5;
    mutable_borrow.y = 2;
    mutable_borrow.z = 1;

    // Error! Can't borrow `point` as immutable because it's currently
    // borrowed as mutable.
    // let y = &point.y;
    // TODO ^ Try uncommenting this line

    // Error! Can't print because `println!` takes an immutable reference.
    // println!("Point Z coordinate is {}", point.z);
    // TODO ^ Try uncommenting this line

    // Ok! Mutable references can be passed as immutable to `println!`
    println!("Point has coordinates: ({}, {}, {})",
                mutable_borrow.x, mutable_borrow.y, mutable_borrow.z);

    // The mutable reference is no longer used for the rest of the code so it
    // is possible to reborrow
    let new_borrowed_point = &point;
    println!("Point now has coordinates: ({}, {}, {})",
             new_borrowed_point.x, new_borrowed_point.y, new_borrowed_point.z);
}

ref 패턴

let 바인딩을 통해 패턴 매칭이나 구조 해체를 할 때, ref 키워드를 이용해 구조체나 튜플의 필드에 대한 참조를 가져올 수 있습니다. 아래의 예시는 이가 유용할 수 있는 몇몇 상황을 들었습니다:

#[derive(Clone, Copy)]
struct Point { x: i32, y: i32 }

fn main() {
    let c = 'Q';

    // A `ref` borrow on the left side of an assignment is equivalent to
    // an `&` borrow on the right side.
    let ref ref_c1 = c;
    let ref_c2 = &c;

    println!("ref_c1 equals ref_c2: {}", *ref_c1 == *ref_c2);

    let point = Point { x: 0, y: 0 };

    // `ref` is also valid when destructuring a struct.
    let _copy_of_x = {
        // `ref_to_x` is a reference to the `x` field of `point`.
        let Point { x: ref ref_to_x, y: _ } = point;

        // Return a copy of the `x` field of `point`.
        *ref_to_x
    };

    // A mutable copy of `point`
    let mut mutable_point = point;

    {
        // `ref` can be paired with `mut` to take mutable references.
        let Point { x: _, y: ref mut mut_ref_to_y } = mutable_point;

        // Mutate the `y` field of `mutable_point` via a mutable reference.
        *mut_ref_to_y = 1;
    }

    println!("point is ({}, {})", point.x, point.y);
    println!("mutable_point is ({}, {})", mutable_point.x, mutable_point.y);

    // A mutable tuple that includes a pointer
    let mut mutable_tuple = (Box::new(5u32), 3u32);
    
    {
        // Destructure `mutable_tuple` to change the value of `last`.
        let (_, ref mut last) = mutable_tuple;
        *last = 2u32;
    }
    
    println!("tuple is {:?}", mutable_tuple);
}

Lifetimes

A lifetime is a construct the compiler (or more specifically, its borrow checker) uses to ensure all borrows are valid. Specifically, a variable's lifetime begins when it is created and ends when it is destroyed. While lifetimes and scopes are often referred to together, they are not the same.

Take, for example, the case where we borrow a variable via &. The borrow has a lifetime that is determined by where it is declared. As a result, the borrow is valid as long as it ends before the lender is destroyed. However, the scope of the borrow is determined by where the reference is used.

In the following example and in the rest of this section, we will see how lifetimes relate to scopes, as well as how the two differ.

// Lifetimes are annotated below with lines denoting the creation
// and destruction of each variable.
// `i` has the longest lifetime because its scope entirely encloses 
// both `borrow1` and `borrow2`. The duration of `borrow1` compared 
// to `borrow2` is irrelevant since they are disjoint.
fn main() {
    let i = 3; // Lifetime for `i` starts. ────────────────┐
    //                                                     │
    { //                                                   │
        let borrow1 = &i; // `borrow1` lifetime starts. ──┐│
        //                                                ││
        println!("borrow1: {}", borrow1); //              ││
    } // `borrow1 ends. ──────────────────────────────────┘│
    //                                                     │
    //                                                     │
    { //                                                   │
        let borrow2 = &i; // `borrow2` lifetime starts. ──┐│
        //                                                ││
        println!("borrow2: {}", borrow2); //              ││
    } // `borrow2` ends. ─────────────────────────────────┘│
    //                                                     │
}   // Lifetime ends. ─────────────────────────────────────┘

Note that no names or types are assigned to label lifetimes. This restricts how lifetimes will be able to be used as we will see.

Explicit annotation

The borrow checker uses explicit lifetime annotations to determine how long references should be valid. In cases where lifetimes are not elided1, Rust requires explicit annotations to determine what the lifetime of a reference should be. The syntax for explicitly annotating a lifetime uses an apostrophe character as follows:

foo<'a>
// `foo` has a lifetime parameter `'a`

Similar to closures, using lifetimes requires generics. Additionally, this lifetime syntax indicates that the lifetime of foo may not exceed that of 'a. Explicit annotation of a type has the form &'a T where 'a has already been introduced.

In cases with multiple lifetimes, the syntax is similar:

foo<'a, 'b>
// `foo` has lifetime parameters `'a` and `'b`

In this case, the lifetime of foo cannot exceed that of either 'a or 'b.

See the following example for explicit lifetime annotation in use:

// `print_refs` takes two references to `i32` which have different
// lifetimes `'a` and `'b`. These two lifetimes must both be at
// least as long as the function `print_refs`.
fn print_refs<'a, 'b>(x: &'a i32, y: &'b i32) {
    println!("x is {} and y is {}", x, y);
}

// A function which takes no arguments, but has a lifetime parameter `'a`.
fn failed_borrow<'a>() {
    let _x = 12;

    // ERROR: `_x` does not live long enough
    //let y: &'a i32 = &_x;
    // Attempting to use the lifetime `'a` as an explicit type annotation 
    // inside the function will fail because the lifetime of `&_x` is shorter
    // than that of `y`. A short lifetime cannot be coerced into a longer one.
}

fn main() {
    // Create variables to be borrowed below.
    let (four, nine) = (4, 9);
    
    // Borrows (`&`) of both variables are passed into the function.
    print_refs(&four, &nine);
    // Any input which is borrowed must outlive the borrower. 
    // In other words, the lifetime of `four` and `nine` must 
    // be longer than that of `print_refs`.
    
    failed_borrow();
    // `failed_borrow` contains no references to force `'a` to be 
    // longer than the lifetime of the function, but `'a` is longer.
    // Because the lifetime is never constrained, it defaults to `'static`.
}

See also:

generics and closures


  1. elision implicitly annotates lifetimes and so is different.

Functions

Ignoring elision, function signatures with lifetimes have a few constraints:

  • any reference must have an annotated lifetime.
  • any reference being returned must have the same lifetime as an input or be static.

Additionally, note that returning references without input is banned if it would result in returning references to invalid data. The following example shows off some valid forms of functions with lifetimes:

// One input reference with lifetime `'a` which must live
// at least as long as the function.
fn print_one<'a>(x: &'a i32) {
    println!("`print_one`: x is {}", x);
}

// Mutable references are possible with lifetimes as well.
fn add_one<'a>(x: &'a mut i32) {
    *x += 1;
}

// Multiple elements with different lifetimes. In this case, it
// would be fine for both to have the same lifetime `'a`, but
// in more complex cases, different lifetimes may be required.
fn print_multi<'a, 'b>(x: &'a i32, y: &'b i32) {
    println!("`print_multi`: x is {}, y is {}", x, y);
}

// Returning references that have been passed in is acceptable.
// However, the correct lifetime must be returned.
fn pass_x<'a, 'b>(x: &'a i32, _: &'b i32) -> &'a i32 { x }

//fn invalid_output<'a>() -> &'a String { &String::from("foo") }
// The above is invalid: `'a` must live longer than the function.
// Here, `&String::from("foo")` would create a `String`, followed by a
// reference. Then the data is dropped upon exiting the scope, leaving
// a reference to invalid data to be returned.

fn main() {
    let x = 7;
    let y = 9;
    
    print_one(&x);
    print_multi(&x, &y);
    
    let z = pass_x(&x, &y);
    print_one(z);

    let mut t = 3;
    add_one(&mut t);
    print_one(&t);
}

See also:

functions

Methods

Methods are annotated similarly to functions:

struct Owner(i32);

impl Owner {
    // Annotate lifetimes as in a standalone function.
    fn add_one<'a>(&'a mut self) { self.0 += 1; }
    fn print<'a>(&'a self) {
        println!("`print`: {}", self.0);
    }
}

fn main() {
    let mut owner = Owner(18);

    owner.add_one();
    owner.print();
}

See also:

methods

Structs

Annotation of lifetimes in structures are also similar to functions:

// A type `Borrowed` which houses a reference to an
// `i32`. The reference to `i32` must outlive `Borrowed`.
#[derive(Debug)]
struct Borrowed<'a>(&'a i32);

// Similarly, both references here must outlive this structure.
#[derive(Debug)]
struct NamedBorrowed<'a> {
    x: &'a i32,
    y: &'a i32,
}

// An enum which is either an `i32` or a reference to one.
#[derive(Debug)]
enum Either<'a> {
    Num(i32),
    Ref(&'a i32),
}

fn main() {
    let x = 18;
    let y = 15;

    let single = Borrowed(&x);
    let double = NamedBorrowed { x: &x, y: &y };
    let reference = Either::Ref(&x);
    let number    = Either::Num(y);

    println!("x is borrowed in {:?}", single);
    println!("x and y are borrowed in {:?}", double);
    println!("x is borrowed in {:?}", reference);
    println!("y is *not* borrowed in {:?}", number);
}

See also:

structs

Traits

Annotation of lifetimes in trait methods basically are similar to functions. Note that impl may have annotation of lifetimes too.

// A struct with annotation of lifetimes.
#[derive(Debug)]
 struct Borrowed<'a> {
     x: &'a i32,
 }

// Annotate lifetimes to impl.
impl<'a> Default for Borrowed<'a> {
    fn default() -> Self {
        Self {
            x: &10,
        }
    }
}

fn main() {
    let b: Borrowed = Default::default();
    println!("b is {:?}", b);
}

See also:

traits

Bounds

Just like generic types can be bounded, lifetimes (themselves generic) use bounds as well. The : character has a slightly different meaning here, but + is the same. Note how the following read:

  1. T: 'a: All references in T must outlive lifetime 'a.
  2. T: Trait + 'a: Type T must implement trait Trait and all references in T must outlive 'a.

The example below shows the above syntax in action used after keyword where:

use std::fmt::Debug; // Trait to bound with.

#[derive(Debug)]
struct Ref<'a, T: 'a>(&'a T);
// `Ref` contains a reference to a generic type `T` that has
// an unknown lifetime `'a`. `T` is bounded such that any
// *references* in `T` must outlive `'a`. Additionally, the lifetime
// of `Ref` may not exceed `'a`.

// A generic function which prints using the `Debug` trait.
fn print<T>(t: T) where
    T: Debug {
    println!("`print`: t is {:?}", t);
}

// Here a reference to `T` is taken where `T` implements
// `Debug` and all *references* in `T` outlive `'a`. In
// addition, `'a` must outlive the function.
fn print_ref<'a, T>(t: &'a T) where
    T: Debug + 'a {
    println!("`print_ref`: t is {:?}", t);
}

fn main() {
    let x = 7;
    let ref_x = Ref(&x);

    print_ref(&ref_x);
    print(ref_x);
}

See also:

generics, bounds in generics, and multiple bounds in generics

Coercion

A longer lifetime can be coerced into a shorter one so that it works inside a scope it normally wouldn't work in. This comes in the form of inferred coercion by the Rust compiler, and also in the form of declaring a lifetime difference:

// Here, Rust infers a lifetime that is as short as possible.
// The two references are then coerced to that lifetime.
fn multiply<'a>(first: &'a i32, second: &'a i32) -> i32 {
    first * second
}

// `<'a: 'b, 'b>` reads as lifetime `'a` is at least as long as `'b`.
// Here, we take in an `&'a i32` and return a `&'b i32` as a result of coercion.
fn choose_first<'a: 'b, 'b>(first: &'a i32, _: &'b i32) -> &'b i32 {
    first
}

fn main() {
    let first = 2; // Longer lifetime
    
    {
        let second = 3; // Shorter lifetime
        
        println!("The product is {}", multiply(&first, &second));
        println!("{} is the first", choose_first(&first, &second));
    };
}

Static

Rust has a few reserved lifetime names. One of those is 'static. You might encounter it in two situations:

// A reference with 'static lifetime:
let s: &'static str = "hello world";

// 'static as part of a trait bound:
fn generic<T>(x: T) where T: 'static {}

Both are related but subtly different and this is a common source for confusion when learning Rust. Here are some examples for each situation:

Reference lifetime

As a reference lifetime 'static indicates that the data pointed to by the reference lives for the entire lifetime of the running program. It can still be coerced to a shorter lifetime.

There are two ways to make a variable with 'static lifetime, and both are stored in the read-only memory of the binary:

  • Make a constant with the static declaration.
  • Make a string literal which has type: &'static str.

See the following example for a display of each method:

// Make a constant with `'static` lifetime.
static NUM: i32 = 18;

// Returns a reference to `NUM` where its `'static`
// lifetime is coerced to that of the input argument.
fn coerce_static<'a>(_: &'a i32) -> &'a i32 {
    &NUM
}

fn main() {
    {
        // Make a `string` literal and print it:
        let static_string = "I'm in read-only memory";
        println!("static_string: {}", static_string);

        // When `static_string` goes out of scope, the reference
        // can no longer be used, but the data remains in the binary.
    }

    {
        // Make an integer to use for `coerce_static`:
        let lifetime_num = 9;

        // Coerce `NUM` to lifetime of `lifetime_num`:
        let coerced_static = coerce_static(&lifetime_num);

        println!("coerced_static: {}", coerced_static);
    }

    println!("NUM: {} stays accessible!", NUM);
}

Trait bound

As a trait bound, it means the type does not contain any non-static references. Eg. the receiver can hold on to the type for as long as they want and it will never become invalid until they drop it.

It's important to understand this means that any owned data always passes a 'static lifetime bound, but a reference to that owned data generally does not:

use std::fmt::Debug;

fn print_it( input: impl Debug + 'static )
{
    println!( "'static value passed in is: {:?}", input );
}

fn use_it()
{
    // i is owned and contains no references, thus it's 'static:
    let i = 5;
    print_it(i);

    // oops, &i only has the lifetime defined by the scope of
    // use_it(), so it's not 'static:
    print_it(&i);
}

The compiler will tell you:

error[E0597]: `i` does not live long enough
  --> src/lib.rs:15:15
   |
15 |     print_it(&i);
   |     ---------^^--
   |     |         |
   |     |         borrowed value does not live long enough
   |     argument requires that `i` is borrowed for `'static`
16 | }
   | - `i` dropped here while still borrowed

See also:

'static constants

Elision

Some lifetime patterns are overwhelmingly common and so the borrow checker will allow you to omit them to save typing and to improve readability. This is known as elision. Elision exists in Rust solely because these patterns are common.

The following code shows a few examples of elision. For a more comprehensive description of elision, see lifetime elision in the book.

// `elided_input` and `annotated_input` essentially have identical signatures
// because the lifetime of `elided_input` is inferred by the compiler:
fn elided_input(x: &i32) {
    println!("`elided_input`: {}", x);
}

fn annotated_input<'a>(x: &'a i32) {
    println!("`annotated_input`: {}", x);
}

// Similarly, `elided_pass` and `annotated_pass` have identical signatures
// because the lifetime is added implicitly to `elided_pass`:
fn elided_pass(x: &i32) -> &i32 { x }

fn annotated_pass<'a>(x: &'a i32) -> &'a i32 { x }

fn main() {
    let x = 3;

    elided_input(&x);
    annotated_input(&x);

    println!("`elided_pass`: {}", elided_pass(&x));
    println!("`annotated_pass`: {}", annotated_pass(&x));
}

See also:

elision

Traits

A trait is a collection of methods defined for an unknown type: Self. They can access other methods declared in the same trait.

Traits can be implemented for any data type. In the example below, we define Animal, a group of methods. The Animal trait is then implemented for the Sheep data type, allowing the use of methods from Animal with a Sheep.

struct Sheep { naked: bool, name: &'static str }

trait Animal {
    // Static method signature; `Self` refers to the implementor type.
    fn new(name: &'static str) -> Self;

    // Instance method signatures; these will return a string.
    fn name(&self) -> &'static str;
    fn noise(&self) -> &'static str;

    // Traits can provide default method definitions.
    fn talk(&self) {
        println!("{} says {}", self.name(), self.noise());
    }
}

impl Sheep {
    fn is_naked(&self) -> bool {
        self.naked
    }

    fn shear(&mut self) {
        if self.is_naked() {
            // Implementor methods can use the implementor's trait methods.
            println!("{} is already naked...", self.name());
        } else {
            println!("{} gets a haircut!", self.name);

            self.naked = true;
        }
    }
}

// Implement the `Animal` trait for `Sheep`.
impl Animal for Sheep {
    // `Self` is the implementor type: `Sheep`.
    fn new(name: &'static str) -> Sheep {
        Sheep { name: name, naked: false }
    }

    fn name(&self) -> &'static str {
        self.name
    }

    fn noise(&self) -> &'static str {
        if self.is_naked() {
            "baaaaah?"
        } else {
            "baaaaah!"
        }
    }
    
    // Default trait methods can be overridden.
    fn talk(&self) {
        // For example, we can add some quiet contemplation.
        println!("{} pauses briefly... {}", self.name, self.noise());
    }
}

fn main() {
    // Type annotation is necessary in this case.
    let mut dolly: Sheep = Animal::new("Dolly");
    // TODO ^ Try removing the type annotations.

    dolly.talk();
    dolly.shear();
    dolly.talk();
}

Derive

The compiler is capable of providing basic implementations for some traits via the #[derive] attribute. These traits can still be manually implemented if a more complex behavior is required.

The following is a list of derivable traits:

  • Comparison traits: Eq, PartialEq, Ord, PartialOrd.
  • Clone, to create T from &T via a copy.
  • Copy, to give a type 'copy semantics' instead of 'move semantics'.
  • Hash, to compute a hash from &T.
  • Default, to create an empty instance of a data type.
  • Debug, to format a value using the {:?} formatter.
// `Centimeters`, a tuple struct that can be compared
#[derive(PartialEq, PartialOrd)]
struct Centimeters(f64);

// `Inches`, a tuple struct that can be printed
#[derive(Debug)]
struct Inches(i32);

impl Inches {
    fn to_centimeters(&self) -> Centimeters {
        let &Inches(inches) = self;

        Centimeters(inches as f64 * 2.54)
    }
}

// `Seconds`, a tuple struct with no additional attributes
struct Seconds(i32);

fn main() {
    let _one_second = Seconds(1);

    // Error: `Seconds` can't be printed; it doesn't implement the `Debug` trait
    //println!("One second looks like: {:?}", _one_second);
    // TODO ^ Try uncommenting this line

    // Error: `Seconds` can't be compared; it doesn't implement the `PartialEq` trait
    //let _this_is_true = (_one_second == _one_second);
    // TODO ^ Try uncommenting this line

    let foot = Inches(12);

    println!("One foot equals {:?}", foot);

    let meter = Centimeters(100.0);

    let cmp =
        if foot.to_centimeters() < meter {
            "smaller"
        } else {
            "bigger"
        };

    println!("One foot is {} than one meter.", cmp);
}

See also:

derive

Returning Traits with dyn

The Rust compiler needs to know how much space every function's return type requires. This means all your functions have to return a concrete type. Unlike other languages, if you have a trait like Animal, you can't write a function that returns Animal, because its different implementations will need different amounts of memory.

However, there's an easy workaround. Instead of returning a trait object directly, our functions return a Box which contains some Animal. A box is just a reference to some memory in the heap. Because a reference has a statically-known size, and the compiler can guarantee it points to a heap-allocated Animal, we can return a trait from our function!

Rust tries to be as explicit as possible whenever it allocates memory on the heap. So if your function returns a pointer-to-trait-on-heap in this way, you need to write the return type with the dyn keyword, e.g. Box<dyn Animal>.

struct Sheep {}
struct Cow {}

trait Animal {
    // Instance method signature
    fn noise(&self) -> &'static str;
}

// Implement the `Animal` trait for `Sheep`.
impl Animal for Sheep {
    fn noise(&self) -> &'static str {
        "baaaaah!"
    }
}

// Implement the `Animal` trait for `Cow`.
impl Animal for Cow {
    fn noise(&self) -> &'static str {
        "moooooo!"
    }
}

// Returns some struct that implements Animal, but we don't know which one at compile time.
fn random_animal(random_number: f64) -> Box<dyn Animal> {
    if random_number < 0.5 {
        Box::new(Sheep {})
    } else {
        Box::new(Cow {})
    }
}

fn main() {
    let random_number = 0.234;
    let animal = random_animal(random_number);
    println!("You've randomly chosen an animal, and it says {}", animal.noise());
}

Operator Overloading

In Rust, many of the operators can be overloaded via traits. That is, some operators can be used to accomplish different tasks based on their input arguments. This is possible because operators are syntactic sugar for method calls. For example, the + operator in a + b calls the add method (as in a.add(b)). This add method is part of the Add trait. Hence, the + operator can be used by any implementor of the Add trait.

A list of the traits, such as Add, that overload operators can be found in core::ops.

use std::ops;

struct Foo;
struct Bar;

#[derive(Debug)]
struct FooBar;

#[derive(Debug)]
struct BarFoo;

// The `std::ops::Add` trait is used to specify the functionality of `+`.
// Here, we make `Add<Bar>` - the trait for addition with a RHS of type `Bar`.
// The following block implements the operation: Foo + Bar = FooBar
impl ops::Add<Bar> for Foo {
    type Output = FooBar;

    fn add(self, _rhs: Bar) -> FooBar {
        println!("> Foo.add(Bar) was called");

        FooBar
    }
}

// By reversing the types, we end up implementing non-commutative addition.
// Here, we make `Add<Foo>` - the trait for addition with a RHS of type `Foo`.
// This block implements the operation: Bar + Foo = BarFoo
impl ops::Add<Foo> for Bar {
    type Output = BarFoo;

    fn add(self, _rhs: Foo) -> BarFoo {
        println!("> Bar.add(Foo) was called");

        BarFoo
    }
}

fn main() {
    println!("Foo + Bar = {:?}", Foo + Bar);
    println!("Bar + Foo = {:?}", Bar + Foo);
}

See Also

Add, Syntax Index

Drop

The Drop trait only has one method: drop, which is called automatically when an object goes out of scope. The main use of the Drop trait is to free the resources that the implementor instance owns.

Box, Vec, String, File, and Process are some examples of types that implement the Drop trait to free resources. The Drop trait can also be manually implemented for any custom data type.

The following example adds a print to console to the drop function to announce when it is called.

struct Droppable {
    name: &'static str,
}

// This trivial implementation of `drop` adds a print to console.
impl Drop for Droppable {
    fn drop(&mut self) {
        println!("> Dropping {}", self.name);
    }
}

fn main() {
    let _a = Droppable { name: "a" };

    // block A
    {
        let _b = Droppable { name: "b" };

        // block B
        {
            let _c = Droppable { name: "c" };
            let _d = Droppable { name: "d" };

            println!("Exiting block B");
        }
        println!("Just exited block B");

        println!("Exiting block A");
    }
    println!("Just exited block A");

    // Variable can be manually dropped using the `drop` function
    drop(_a);
    // TODO ^ Try commenting this line

    println!("end of the main function");

    // `_a` *won't* be `drop`ed again here, because it already has been
    // (manually) `drop`ed
}

Iterators

The Iterator trait is used to implement iterators over collections such as arrays.

The trait requires only a method to be defined for the next element, which may be manually defined in an impl block or automatically defined (as in arrays and ranges).

As a point of convenience for common situations, the for construct turns some collections into iterators using the .into_iter() method.

struct Fibonacci {
    curr: u32,
    next: u32,
}

// Implement `Iterator` for `Fibonacci`.
// The `Iterator` trait only requires a method to be defined for the `next` element.
impl Iterator for Fibonacci {
    type Item = u32;
    
    // Here, we define the sequence using `.curr` and `.next`.
    // The return type is `Option<T>`:
    //     * When the `Iterator` is finished, `None` is returned.
    //     * Otherwise, the next value is wrapped in `Some` and returned.
    fn next(&mut self) -> Option<u32> {
        let new_next = self.curr + self.next;

        self.curr = self.next;
        self.next = new_next;

        // Since there's no endpoint to a Fibonacci sequence, the `Iterator` 
        // will never return `None`, and `Some` is always returned.
        Some(self.curr)
    }
}

// Returns a Fibonacci sequence generator
fn fibonacci() -> Fibonacci {
    Fibonacci { curr: 0, next: 1 }
}

fn main() {
    // `0..3` is an `Iterator` that generates: 0, 1, and 2.
    let mut sequence = 0..3;

    println!("Four consecutive `next` calls on 0..3");
    println!("> {:?}", sequence.next());
    println!("> {:?}", sequence.next());
    println!("> {:?}", sequence.next());
    println!("> {:?}", sequence.next());

    // `for` works through an `Iterator` until it returns `None`.
    // Each `Some` value is unwrapped and bound to a variable (here, `i`).
    println!("Iterate through 0..3 using `for`");
    for i in 0..3 {
        println!("> {}", i);
    }

    // The `take(n)` method reduces an `Iterator` to its first `n` terms.
    println!("The first four terms of the Fibonacci sequence are: ");
    for i in fibonacci().take(4) {
        println!("> {}", i);
    }

    // The `skip(n)` method shortens an `Iterator` by dropping its first `n` terms.
    println!("The next four terms of the Fibonacci sequence are: ");
    for i in fibonacci().skip(4).take(4) {
        println!("> {}", i);
    }

    let array = [1u32, 3, 3, 7];

    // The `iter` method produces an `Iterator` over an array/slice.
    println!("Iterate the following array {:?}", &array);
    for i in array.iter() {
        println!("> {}", i);
    }
}

impl Trait

If your function returns a type that implements MyTrait, you can write its return type as -> impl MyTrait. This can help simplify your type signatures quite a lot!

use std::iter;
use std::vec::IntoIter;

// This function combines two `Vec<i32>` and returns an iterator over it.
// Look how complicated its return type is!
fn combine_vecs_explicit_return_type(
    v: Vec<i32>,
    u: Vec<i32>,
) -> iter::Cycle<iter::Chain<IntoIter<i32>, IntoIter<i32>>> {
    v.into_iter().chain(u.into_iter()).cycle()
}

// This is the exact same function, but its return type uses `impl Trait`.
// Look how much simpler it is!
fn combine_vecs(
    v: Vec<i32>,
    u: Vec<i32>,
) -> impl Iterator<Item=i32> {
    v.into_iter().chain(u.into_iter()).cycle()
}

fn main() {
    let v1 = vec![1, 2, 3];
    let v2 = vec![4, 5];
    let mut v3 = combine_vecs(v1, v2);
    assert_eq!(Some(1), v3.next());
    assert_eq!(Some(2), v3.next());
    assert_eq!(Some(3), v3.next());
    assert_eq!(Some(4), v3.next());
    assert_eq!(Some(5), v3.next());
    println!("all done");
}

More importantly, some Rust types can't be written out. For example, every closure has its own unnamed concrete type. Before impl Trait syntax, you had to allocate on the heap in order to return a closure. But now you can do it all statically, like this:

// Returns a function that adds `y` to its input
fn make_adder_function(y: i32) -> impl Fn(i32) -> i32 {
    let closure = move |x: i32| { x + y };
    closure
}

fn main() {
    let plus_one = make_adder_function(1);
    assert_eq!(plus_one(2), 3);
}

You can also use impl Trait to return an iterator that uses map or filter closures! This makes using map and filter easier. Because closure types don't have names, you can't write out an explicit return type if your function returns iterators with closures. But with impl Trait you can do this easily:

fn double_positives<'a>(numbers: &'a Vec<i32>) -> impl Iterator<Item = i32> + 'a {
    numbers
        .iter()
        .filter(|x| x > &&0)
        .map(|x| x * 2)
}

Clone

When dealing with resources, the default behavior is to transfer them during assignments or function calls. However, sometimes we need to make a copy of the resource as well.

The Clone trait helps us do exactly this. Most commonly, we can use the .clone() method defined by the Clone trait.

// A unit struct without resources
#[derive(Debug, Clone, Copy)]
struct Unit;

// A tuple struct with resources that implements the `Clone` trait
#[derive(Clone, Debug)]
struct Pair(Box<i32>, Box<i32>);

fn main() {
    // Instantiate `Unit`
    let unit = Unit;
    // Copy `Unit`, there are no resources to move
    let copied_unit = unit;

    // Both `Unit`s can be used independently
    println!("original: {:?}", unit);
    println!("copy: {:?}", copied_unit);

    // Instantiate `Pair`
    let pair = Pair(Box::new(1), Box::new(2));
    println!("original: {:?}", pair);

    // Move `pair` into `moved_pair`, moves resources
    let moved_pair = pair;
    println!("moved: {:?}", moved_pair);

    // Error! `pair` has lost its resources
    //println!("original: {:?}", pair);
    // TODO ^ Try uncommenting this line

    // Clone `moved_pair` into `cloned_pair` (resources are included)
    let cloned_pair = moved_pair.clone();
    // Drop the original pair using std::mem::drop
    drop(moved_pair);

    // Error! `moved_pair` has been dropped
    //println!("copy: {:?}", moved_pair);
    // TODO ^ Try uncommenting this line

    // The result from .clone() can still be used!
    println!("clone: {:?}", cloned_pair);
}

Supertraits

Rust doesn't have "inheritance", but you can define a trait as being a superset of another trait. For example:

trait Person {
    fn name(&self) -> String;
}

// Person is a supertrait of Student.
// Implementing Student requires you to also impl Person.
trait Student: Person {
    fn university(&self) -> String;
}

trait Programmer {
    fn fav_language(&self) -> String;
}

// CompSciStudent (computer science student) is a subtrait of both Programmer 
// and Student. Implementing CompSciStudent requires you to impl both supertraits.
trait CompSciStudent: Programmer + Student {
    fn git_username(&self) -> String;
}

fn comp_sci_student_greeting(student: &dyn CompSciStudent) -> String {
    format!(
        "My name is {} and I attend {}. My favorite language is {}. My Git username is {}",
        student.name(),
        student.university(),
        student.fav_language(),
        student.git_username()
    )
}

fn main() {}

See also:

The Rust Programming Language chapter on supertraits

Disambiguating overlapping traits

A type can implement many different traits. What if two traits both require the same name? For example, many traits might have a method named get(). They might even have different return types!

Good news: because each trait implementation gets its own impl block, it's clear which trait's get method you're implementing.

What about when it comes time to call those methods? To disambiguate between them, we have to use Fully Qualified Syntax.

trait UsernameWidget {
    // Get the selected username out of this widget
    fn get(&self) -> String;
}

trait AgeWidget {
    // Get the selected age out of this widget
    fn get(&self) -> u8;
}

// A form with both a UsernameWidget and an AgeWidget
struct Form {
    username: String,
    age: u8,
}

impl UsernameWidget for Form {
    fn get(&self) -> String {
        self.username.clone()
    }
}

impl AgeWidget for Form {
    fn get(&self) -> u8 {
        self.age
    }
}

fn main() {
    let form = Form{
        username: "rustacean".to_owned(),
        age: 28,
    };

    // If you uncomment this line, you'll get an error saying 
    // "multiple `get` found". Because, after all, there are multiple methods
    // named `get`.
    // println!("{}", form.get());

    let username = <Form as UsernameWidget>::get(&form);
    assert_eq!("rustacean".to_owned(), username);
    let age = <Form as AgeWidget>::get(&form);
    assert_eq!(28, age);
}

See also:

The Rust Programming Language chapter on Fully Qualified syntax

macro_rules!

Rust provides a powerful macro system that allows metaprogramming. As you've seen in previous chapters, macros look like functions, except that their name ends with a bang !, but instead of generating a function call, macros are expanded into source code that gets compiled with the rest of the program. However, unlike macros in C and other languages, Rust macros are expanded into abstract syntax trees, rather than string preprocessing, so you don't get unexpected precedence bugs.

Macros are created using the macro_rules! macro.

// This is a simple macro named `say_hello`.
macro_rules! say_hello {
    // `()` indicates that the macro takes no argument.
    () => {
        // The macro will expand into the contents of this block.
        println!("Hello!");
    };
}

fn main() {
    // This call will expand into `println!("Hello");`
    say_hello!()
}

So why are macros useful?

  1. Don't repeat yourself. There are many cases where you may need similar functionality in multiple places but with different types. Often, writing a macro is a useful way to avoid repeating code. (More on this later)

  2. Domain-specific languages. Macros allow you to define special syntax for a specific purpose. (More on this later)

  3. Variadic interfaces. Sometimes you want to define an interface that takes a variable number of arguments. An example is println! which could take any number of arguments, depending on the format string!. (More on this later)

Syntax

In following subsections, we will show how to define macros in Rust. There are three basic ideas:

Designators

The arguments of a macro are prefixed by a dollar sign $ and type annotated with a designator:

macro_rules! create_function {
    // This macro takes an argument of designator `ident` and
    // creates a function named `$func_name`.
    // The `ident` designator is used for variable/function names.
    ($func_name:ident) => {
        fn $func_name() {
            // The `stringify!` macro converts an `ident` into a string.
            println!("You called {:?}()",
                     stringify!($func_name));
        }
    };
}

// Create functions named `foo` and `bar` with the above macro.
create_function!(foo);
create_function!(bar);

macro_rules! print_result {
    // This macro takes an expression of type `expr` and prints
    // it as a string along with its result.
    // The `expr` designator is used for expressions.
    ($expression:expr) => {
        // `stringify!` will convert the expression *as it is* into a string.
        println!("{:?} = {:?}",
                 stringify!($expression),
                 $expression);
    };
}

fn main() {
    foo();
    bar();

    print_result!(1u32 + 1);

    // Recall that blocks are expressions too!
    print_result!({
        let x = 1u32;

        x * x + 2 * x - 1
    });
}

These are some of the available designators:

  • block
  • expr is used for expressions
  • ident is used for variable/function names
  • item
  • literal is used for literal constants
  • pat (pattern)
  • path
  • stmt (statement)
  • tt (token tree)
  • ty (type)
  • vis (visibility qualifier)

For a complete list, see the Rust Reference.

Overload

Macros can be overloaded to accept different combinations of arguments. In that regard, macro_rules! can work similarly to a match block:

// `test!` will compare `$left` and `$right`
// in different ways depending on how you invoke it:
macro_rules! test {
    // Arguments don't need to be separated by a comma.
    // Any template can be used!
    ($left:expr; and $right:expr) => {
        println!("{:?} and {:?} is {:?}",
                 stringify!($left),
                 stringify!($right),
                 $left && $right)
    };
    // ^ each arm must end with a semicolon.
    ($left:expr; or $right:expr) => {
        println!("{:?} or {:?} is {:?}",
                 stringify!($left),
                 stringify!($right),
                 $left || $right)
    };
}

fn main() {
    test!(1i32 + 1 == 2i32; and 2i32 * 2 == 4i32);
    test!(true; or false);
}

Repeat

Macros can use + in the argument list to indicate that an argument may repeat at least once, or *, to indicate that the argument may repeat zero or more times.

In the following example, surrounding the matcher with $(...),+ will match one or more expression, separated by commas. Also note that the semicolon is optional on the last case.

// `find_min!` will calculate the minimum of any number of arguments.
macro_rules! find_min {
    // Base case:
    ($x:expr) => ($x);
    // `$x` followed by at least one `$y,`
    ($x:expr, $($y:expr),+) => (
        // Call `find_min!` on the tail `$y`
        std::cmp::min($x, find_min!($($y),+))
    )
}

fn main() {
    println!("{}", find_min!(1u32));
    println!("{}", find_min!(1u32 + 2, 2u32));
    println!("{}", find_min!(5u32, 2u32 * 3, 4u32));
}

DRY (Don't Repeat Yourself)

Macros allow writing DRY code by factoring out the common parts of functions and/or test suites. Here is an example that implements and tests the +=, *= and -= operators on Vec<T>:

use std::ops::{Add, Mul, Sub};

macro_rules! assert_equal_len {
    // The `tt` (token tree) designator is used for
    // operators and tokens.
    ($a:expr, $b:expr, $func:ident, $op:tt) => {
        assert!($a.len() == $b.len(),
                "{:?}: dimension mismatch: {:?} {:?} {:?}",
                stringify!($func),
                ($a.len(),),
                stringify!($op),
                ($b.len(),));
    };
}

macro_rules! op {
    ($func:ident, $bound:ident, $op:tt, $method:ident) => {
        fn $func<T: $bound<T, Output=T> + Copy>(xs: &mut Vec<T>, ys: &Vec<T>) {
            assert_equal_len!(xs, ys, $func, $op);

            for (x, y) in xs.iter_mut().zip(ys.iter()) {
                *x = $bound::$method(*x, *y);
                // *x = x.$method(*y);
            }
        }
    };
}

// Implement `add_assign`, `mul_assign`, and `sub_assign` functions.
op!(add_assign, Add, +=, add);
op!(mul_assign, Mul, *=, mul);
op!(sub_assign, Sub, -=, sub);

mod test {
    use std::iter;
    macro_rules! test {
        ($func:ident, $x:expr, $y:expr, $z:expr) => {
            #[test]
            fn $func() {
                for size in 0usize..10 {
                    let mut x: Vec<_> = iter::repeat($x).take(size).collect();
                    let y: Vec<_> = iter::repeat($y).take(size).collect();
                    let z: Vec<_> = iter::repeat($z).take(size).collect();

                    super::$func(&mut x, &y);

                    assert_eq!(x, z);
                }
            }
        };
    }

    // Test `add_assign`, `mul_assign`, and `sub_assign`.
    test!(add_assign, 1u32, 2u32, 3u32);
    test!(mul_assign, 2u32, 3u32, 6u32);
    test!(sub_assign, 3u32, 2u32, 1u32);
}
$ rustc --test dry.rs && ./dry
running 3 tests
test test::mul_assign ... ok
test test::add_assign ... ok
test test::sub_assign ... ok

test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured

Domain Specific Languages (DSLs)

A DSL is a mini "language" embedded in a Rust macro. It is completely valid Rust because the macro system expands into normal Rust constructs, but it looks like a small language. This allows you to define concise or intuitive syntax for some special functionality (within bounds).

Suppose that I want to define a little calculator API. I would like to supply an expression and have the output printed to console.

macro_rules! calculate {
    (eval $e:expr) => {{
        {
            let val: usize = $e; // Force types to be integers
            println!("{} = {}", stringify!{$e}, val);
        }
    }};
}

fn main() {
    calculate! {
        eval 1 + 2 // hehehe `eval` is _not_ a Rust keyword!
    }

    calculate! {
        eval (1 + 2) * (3 / 4)
    }
}

Output:

1 + 2 = 3
(1 + 2) * (3 / 4) = 0

This was a very simple example, but much more complex interfaces have been developed, such as lazy_static or clap.

Also, note the two pairs of braces in the macro. The outer ones are part of the syntax of macro_rules!, in addition to () or [].

Variadic Interfaces

A variadic interface takes an arbitrary number of arguments. For example, println! can take an arbitrary number of arguments, as determined by the format string.

We can extend our calculate! macro from the previous section to be variadic:

macro_rules! calculate {
    // The pattern for a single `eval`
    (eval $e:expr) => {{
        {
            let val: usize = $e; // Force types to be integers
            println!("{} = {}", stringify!{$e}, val);
        }
    }};

    // Decompose multiple `eval`s recursively
    (eval $e:expr, $(eval $es:expr),+) => {{
        calculate! { eval $e }
        calculate! { $(eval $es),+ }
    }};
}

fn main() {
    calculate! { // Look ma! Variadic `calculate!`!
        eval 1 + 2,
        eval 3 + 4,
        eval (2 * 3) + 1
    }
}

Output:

1 + 2 = 3
3 + 4 = 7
(2 * 3) + 1 = 7

Error handling

Error handling is the process of handling the possibility of failure. For example, failing to read a file and then continuing to use that bad input would clearly be problematic. Noticing and explicitly managing those errors saves the rest of the program from various pitfalls.

There are various ways to deal with errors in Rust, which are described in the following subchapters. They all have more or less subtle differences and different use cases. As a rule of thumb:

An explicit panic is mainly useful for tests and dealing with unrecoverable errors. For prototyping it can be useful, for example when dealing with functions that haven't been implemented yet, but in those cases the more descriptive unimplemented is better. In tests panic is a reasonable way to explicitly fail.

The Option type is for when a value is optional or when the lack of a value is not an error condition. For example the parent of a directory - / and C: don't have one. When dealing with Options, unwrap is fine for prototyping and cases where it's absolutely certain that there is guaranteed to be a value. However expect is more useful since it lets you specify an error message in case something goes wrong anyway.

When there is a chance that things do go wrong and the caller has to deal with the problem, use Result. You can unwrap and expect them as well (please don't do that unless it's a test or quick prototype).

For a more rigorous discussion of error handling, refer to the error handling section in the official book.

panic

The simplest error handling mechanism we will see is panic. It prints an error message, starts unwinding the stack, and usually exits the program. Here, we explicitly call panic on our error condition:

fn drink(beverage: &str) {
    // You shouldn't drink too much sugary beverages.
    if beverage == "lemonade" { panic!("AAAaaaaa!!!!"); }

    println!("Some refreshing {} is all I need.", beverage);
}

fn main() {
    drink("water");
    drink("lemonade");
}

abort and unwind

The previous section illustrates the error handling mechanism panic. Different code paths can be conditionally compiled based on the panic setting. The current values available are unwind and abort.

Building on the prior lemonade example, we explicitly use the panic strategy to exercise different lines of code.

fn drink(beverage: &str) {
    // You shouldn't drink too much sugary beverages.
    if beverage == "lemonade" {
        if cfg!(panic = "abort") {
            println!("This is not your party. Run!!!!");
        } else {
            println!("Spit it out!!!!");
        }
    } else {
        println!("Some refreshing {} is all I need.", beverage);
    }
}

fn main() {
    drink("water");
    drink("lemonade");
}

Here is another example focusing on rewriting drink() and explicitly use the unwind keyword.

#[cfg(panic = "unwind")]
fn ah() {
    println!("Spit it out!!!!");
}

#[cfg(not(panic = "unwind"))]
fn ah() {
    println!("This is not your party. Run!!!!");
}

fn drink(beverage: &str) {
    if beverage == "lemonade" {
        ah();
    } else {
        println!("Some refreshing {} is all I need.", beverage);
    }
}

fn main() {
    drink("water");
    drink("lemonade");
}

The panic strategy can be set from the command line by using abort or unwind.

rustc  lemonade.rs -C panic=abort

Option & unwrap

In the last example, we showed that we can induce program failure at will. We told our program to panic if the royal received an inappropriate gift - a snake. But what if the royal expected a gift and didn't receive one? That case would be just as bad, so it needs to be handled!

We could test this against the null string ("") as we do with a snake. Since we're using Rust, let's instead have the compiler point out cases where there's no gift.

An enum called Option<T> in the std library is used when absence is a possibility. It manifests itself as one of two "options":

  • Some(T): An element of type T was found
  • None: No element was found

These cases can either be explicitly handled via match or implicitly with unwrap. Implicit handling will either return the inner element or panic.

Note that it's possible to manually customize panic with expect, but unwrap otherwise leaves us with a less meaningful output than explicit handling. In the following example, explicit handling yields a more controlled result while retaining the option to panic if desired.

// The commoner has seen it all, and can handle any gift well.
// All gifts are handled explicitly using `match`.
fn give_commoner(gift: Option<&str>) {
    // Specify a course of action for each case.
    match gift {
        Some("snake") => println!("Yuck! I'm putting this snake back in the forest."),
        Some(inner)   => println!("{}? How nice.", inner),
        None          => println!("No gift? Oh well."),
    }
}

// Our sheltered royal will `panic` at the sight of snakes.
// All gifts are handled implicitly using `unwrap`.
fn give_royal(gift: Option<&str>) {
    // `unwrap` returns a `panic` when it receives a `None`.
    let inside = gift.unwrap();
    if inside == "snake" { panic!("AAAaaaaa!!!!"); }

    println!("I love {}s!!!!!", inside);
}

fn main() {
    let food  = Some("cabbage");
    let snake = Some("snake");
    let void  = None;

    give_commoner(food);
    give_commoner(snake);
    give_commoner(void);

    let bird = Some("robin");
    let nothing = None;

    give_royal(bird);
    give_royal(nothing);
}

Unpacking options with ?

You can unpack Options by using match statements, but it's often easier to use the ? operator. If x is an Option, then evaluating x? will return the underlying value if x is Some, otherwise it will terminate whatever function is being executed and return None.

fn next_birthday(current_age: Option<u8>) -> Option<String> {
	// If `current_age` is `None`, this returns `None`.
	// If `current_age` is `Some`, the inner `u8` gets assigned to `next_age`
    let next_age: u8 = current_age?;
    Some(format!("Next year I will be {}", next_age))
}

You can chain many ?s together to make your code much more readable.

struct Person {
    job: Option<Job>,
}

#[derive(Clone, Copy)]
struct Job {
    phone_number: Option<PhoneNumber>,
}

#[derive(Clone, Copy)]
struct PhoneNumber {
    area_code: Option<u8>,
    number: u32,
}

impl Person {

    // Gets the area code of the phone number of the person's job, if it exists.
    fn work_phone_area_code(&self) -> Option<u8> {
        // This would need many nested `match` statements without the `?` operator.
        // It would take a lot more code - try writing it yourself and see which
        // is easier.
        self.job?.phone_number?.area_code
    }
}

fn main() {
    let p = Person {
        job: Some(Job {
            phone_number: Some(PhoneNumber {
                area_code: Some(61),
                number: 439222222,
            }),
        }),
    };

    assert_eq!(p.work_phone_area_code(), Some(61));
}

Combinators: map

match is a valid method for handling Options. However, you may eventually find heavy usage tedious, especially with operations only valid with an input. In these cases, combinators can be used to manage control flow in a modular fashion.

Option has a built in method called map(), a combinator for the simple mapping of Some -> Some and None -> None. Multiple map() calls can be chained together for even more flexibility.

In the following example, process() replaces all functions previous to it while staying compact.

#![allow(dead_code)]

#[derive(Debug)] enum Food { Apple, Carrot, Potato }

#[derive(Debug)] struct Peeled(Food);
#[derive(Debug)] struct Chopped(Food);
#[derive(Debug)] struct Cooked(Food);

// Peeling food. If there isn't any, then return `None`.
// Otherwise, return the peeled food.
fn peel(food: Option<Food>) -> Option<Peeled> {
    match food {
        Some(food) => Some(Peeled(food)),
        None       => None,
    }
}

// Chopping food. If there isn't any, then return `None`.
// Otherwise, return the chopped food.
fn chop(peeled: Option<Peeled>) -> Option<Chopped> {
    match peeled {
        Some(Peeled(food)) => Some(Chopped(food)),
        None               => None,
    }
}

// Cooking food. Here, we showcase `map()` instead of `match` for case handling.
fn cook(chopped: Option<Chopped>) -> Option<Cooked> {
    chopped.map(|Chopped(food)| Cooked(food))
}

// A function to peel, chop, and cook food all in sequence.
// We chain multiple uses of `map()` to simplify the code.
fn process(food: Option<Food>) -> Option<Cooked> {
    food.map(|f| Peeled(f))
        .map(|Peeled(f)| Chopped(f))
        .map(|Chopped(f)| Cooked(f))
}

// Check whether there's food or not before trying to eat it!
fn eat(food: Option<Cooked>) {
    match food {
        Some(food) => println!("Mmm. I love {:?}", food),
        None       => println!("Oh no! It wasn't edible."),
    }
}

fn main() {
    let apple = Some(Food::Apple);
    let carrot = Some(Food::Carrot);
    let potato = None;

    let cooked_apple = cook(chop(peel(apple)));
    let cooked_carrot = cook(chop(peel(carrot)));
    // Let's try the simpler looking `process()` now.
    let cooked_potato = process(potato);

    eat(cooked_apple);
    eat(cooked_carrot);
    eat(cooked_potato);
}

See also:

closures, Option, Option::map()

Combinators: and_then

map() was described as a chainable way to simplify match statements. However, using map() on a function that returns an Option<T> results in the nested Option<Option<T>>. Chaining multiple calls together can then become confusing. That's where another combinator called and_then(), known in some languages as flatmap, comes in.

and_then() calls its function input with the wrapped value and returns the result. If the Option is None, then it returns None instead.

In the following example, cookable_v2() results in an Option<Food>. Using map() instead of and_then() would have given an Option<Option<Food>>, which is an invalid type for eat().

#![allow(dead_code)]

#[derive(Debug)] enum Food { CordonBleu, Steak, Sushi }
#[derive(Debug)] enum Day { Monday, Tuesday, Wednesday }

// We don't have the ingredients to make Sushi.
fn have_ingredients(food: Food) -> Option<Food> {
    match food {
        Food::Sushi => None,
        _           => Some(food),
    }
}

// We have the recipe for everything except Cordon Bleu.
fn have_recipe(food: Food) -> Option<Food> {
    match food {
        Food::CordonBleu => None,
        _                => Some(food),
    }
}

// To make a dish, we need both the recipe and the ingredients.
// We can represent the logic with a chain of `match`es:
fn cookable_v1(food: Food) -> Option<Food> {
    match have_recipe(food) {
        None       => None,
        Some(food) => match have_ingredients(food) {
            None       => None,
            Some(food) => Some(food),
        },
    }
}

// This can conveniently be rewritten more compactly with `and_then()`:
fn cookable_v2(food: Food) -> Option<Food> {
    have_recipe(food).and_then(have_ingredients)
}

fn eat(food: Food, day: Day) {
    match cookable_v2(food) {
        Some(food) => println!("Yay! On {:?} we get to eat {:?}.", day, food),
        None       => println!("Oh no. We don't get to eat on {:?}?", day),
    }
}

fn main() {
    let (cordon_bleu, steak, sushi) = (Food::CordonBleu, Food::Steak, Food::Sushi);

    eat(cordon_bleu, Day::Monday);
    eat(steak, Day::Tuesday);
    eat(sushi, Day::Wednesday);
}

See also:

closures, Option, and Option::and_then()

Unpacking options and defaults

There is more than one way to unpack an Option and fall back on a default if it is None. To choose the one that meets our needs, we need to consider the following:

  • do we need eager or lazy evaluation?
  • do we need to keep the original empty value intact, or modify it in place?

or() is chainable, evaluates eagerly, keeps empty value intact

or()is chainable and eagerly evaluates its argument, as is shown in the following example. Note that because or's arguments are evaluated eagerly, the variable passed to or is moved.

#[derive(Debug)] 
enum Fruit { Apple, Orange, Banana, Kiwi, Lemon }

fn main() {
    let apple = Some(Fruit::Apple);
    let orange = Some(Fruit::Orange);
    let no_fruit: Option<Fruit> = None;

    let first_available_fruit = no_fruit.or(orange).or(apple);
    println!("first_available_fruit: {:?}", first_available_fruit);
    // first_available_fruit: Some(Orange)

    // `or` moves its argument.
    // In the example above, `or(orange)` returned a `Some`, so `or(apple)` was not invoked.
    // But the variable named `apple` has been moved regardless, and cannot be used anymore.
    // println!("Variable apple was moved, so this line won't compile: {:?}", apple);
    // TODO: uncomment the line above to see the compiler error
 }

or_else() is chainable, evaluates lazily, keeps empty value intact

Another alternative is to use or_else, which is also chainable, and evaluates lazily, as is shown in the following example:

#[derive(Debug)] 
enum Fruit { Apple, Orange, Banana, Kiwi, Lemon }

fn main() {
    let no_fruit: Option<Fruit> = None;
    let get_kiwi_as_fallback = || {
        println!("Providing kiwi as fallback");
        Some(Fruit::Kiwi)
    };
    let get_lemon_as_fallback = || {
        println!("Providing lemon as fallback");
        Some(Fruit::Lemon)
    };

    let first_available_fruit = no_fruit
        .or_else(get_kiwi_as_fallback)
        .or_else(get_lemon_as_fallback);
    println!("first_available_fruit: {:?}", first_available_fruit);
    // Providing kiwi as fallback
    // first_available_fruit: Some(Kiwi)
}

get_or_insert() evaluates eagerly, modifies empty value in place

To make sure that an Option contains a value, we can use get_or_insert to modify it in place with a fallback value, as is shown in the following example. Note that get_or_insert eagerly evaluates its parameter, so variable apple is moved:

#[derive(Debug)]
enum Fruit { Apple, Orange, Banana, Kiwi, Lemon }

fn main() {
    let mut my_fruit: Option<Fruit> = None;
    let apple = Fruit::Apple;
    let first_available_fruit = my_fruit.get_or_insert(apple);
    println!("first_available_fruit is: {:?}", first_available_fruit);
    println!("my_fruit is: {:?}", my_fruit);
    // first_available_fruit is: Apple
    // my_fruit is: Some(Apple)
    //println!("Variable named `apple` is moved: {:?}", apple);
    // TODO: uncomment the line above to see the compiler error
}

get_or_insert_with() evaluates lazily, modifies empty value in place

Instead of explicitly providing a value to fall back on, we can pass a closure to get_or_insert_with, as follows:

#[derive(Debug)] 
enum Fruit { Apple, Orange, Banana, Kiwi, Lemon }

fn main() {
    let mut my_fruit: Option<Fruit> = None;
    let get_lemon_as_fallback = || {
        println!("Providing lemon as fallback");
        Fruit::Lemon
    };
    let first_available_fruit = my_fruit
        .get_or_insert_with(get_lemon_as_fallback);
    println!("first_available_fruit is: {:?}", first_available_fruit);
    println!("my_fruit is: {:?}", my_fruit);
    // Providing lemon as fallback
    // first_available_fruit is: Lemon
    // my_fruit is: Some(Lemon)

    // If the Option has a value, it is left unchanged, and the closure is not invoked
    let mut my_apple = Some(Fruit::Apple);
    let should_be_apple = my_apple.get_or_insert_with(get_lemon_as_fallback);
    println!("should_be_apple is: {:?}", should_be_apple);
    println!("my_apple is unchanged: {:?}", my_apple);
    // The output is a follows. Note that the closure `get_lemon_as_fallback` is not invoked
    // should_be_apple is: Apple
    // my_apple is unchanged: Some(Apple)
}

See also:

closures, get_or_insert, get_or_insert_with, moved variables, or, or_else

Result

Result is a richer version of the Option type that describes possible error instead of possible absence.

That is, Result<T, E> could have one of two outcomes:

  • Ok(T): An element T was found
  • Err(E): An error was found with element E

By convention, the expected outcome is Ok while the unexpected outcome is Err.

Like Option, Result has many methods associated with it. unwrap(), for example, either yields the element T or panics. For case handling, there are many combinators between Result and Option that overlap.

In working with Rust, you will likely encounter methods that return the Result type, such as the parse() method. It might not always be possible to parse a string into the other type, so parse() returns a Result indicating possible failure.

Let's see what happens when we successfully and unsuccessfully parse() a string:

fn multiply(first_number_str: &str, second_number_str: &str) -> i32 {
    // Let's try using `unwrap()` to get the number out. Will it bite us?
    let first_number = first_number_str.parse::<i32>().unwrap();
    let second_number = second_number_str.parse::<i32>().unwrap();
    first_number * second_number
}

fn main() {
    let twenty = multiply("10", "2");
    println!("double is {}", twenty);

    let tt = multiply("t", "2");
    println!("double is {}", tt);
}

In the unsuccessful case, parse() leaves us with an error for unwrap() to panic on. Additionally, the panic exits our program and provides an unpleasant error message.

To improve the quality of our error message, we should be more specific about the return type and consider explicitly handling the error.

Using Result in main

The Result type can also be the return type of the main function if specified explicitly. Typically the main function will be of the form:

fn main() {
    println!("Hello World!");
}

However main is also able to have a return type of Result. If an error occurs within the main function it will return an error code and print a debug representation of the error (using the Debug trait). The following example shows such a scenario and touches on aspects covered in the following section.

use std::num::ParseIntError;

fn main() -> Result<(), ParseIntError> {
    let number_str = "10";
    let number = match number_str.parse::<i32>() {
        Ok(number)  => number,
        Err(e) => return Err(e),
    };
    println!("{}", number);
    Ok(())
}

map for Result

Panicking in the previous example's multiply does not make for robust code. Generally, we want to return the error to the caller so it can decide what is the right way to respond to errors.

We first need to know what kind of error type we are dealing with. To determine the Err type, we look to parse(), which is implemented with the FromStr trait for i32. As a result, the Err type is specified as ParseIntError.

In the example below, the straightforward match statement leads to code that is overall more cumbersome.

use std::num::ParseIntError;

// With the return type rewritten, we use pattern matching without `unwrap()`.
fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> {
    match first_number_str.parse::<i32>() {
        Ok(first_number)  => {
            match second_number_str.parse::<i32>() {
                Ok(second_number)  => {
                    Ok(first_number * second_number)
                },
                Err(e) => Err(e),
            }
        },
        Err(e) => Err(e),
    }
}

fn print(result: Result<i32, ParseIntError>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    // This still presents a reasonable answer.
    let twenty = multiply("10", "2");
    print(twenty);

    // The following now provides a much more helpful error message.
    let tt = multiply("t", "2");
    print(tt);
}

Luckily, Option's map, and_then, and many other combinators are also implemented for Result. Result contains a complete listing.

use std::num::ParseIntError;

// As with `Option`, we can use combinators such as `map()`.
// This function is otherwise identical to the one above and reads:
// Modify n if the value is valid, otherwise pass on the error.
fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> {
    first_number_str.parse::<i32>().and_then(|first_number| {
        second_number_str.parse::<i32>().map(|second_number| first_number * second_number)
    })
}

fn print(result: Result<i32, ParseIntError>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    // This still presents a reasonable answer.
    let twenty = multiply("10", "2");
    print(twenty);

    // The following now provides a much more helpful error message.
    let tt = multiply("t", "2");
    print(tt);
}

aliases for Result

How about when we want to reuse a specific Result type many times? Recall that Rust allows us to create aliases. Conveniently, we can define one for the specific Result in question.

At a module level, creating aliases can be particularly helpful. Errors found in a specific module often have the same Err type, so a single alias can succinctly define all associated Results. This is so useful that the std library even supplies one: io::Result!

Here's a quick example to show off the syntax:

use std::num::ParseIntError;

// Define a generic alias for a `Result` with the error type `ParseIntError`.
type AliasedResult<T> = Result<T, ParseIntError>;

// Use the above alias to refer to our specific `Result` type.
fn multiply(first_number_str: &str, second_number_str: &str) -> AliasedResult<i32> {
    first_number_str.parse::<i32>().and_then(|first_number| {
        second_number_str.parse::<i32>().map(|second_number| first_number * second_number)
    })
}

// Here, the alias again allows us to save some space.
fn print(result: AliasedResult<i32>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    print(multiply("10", "2"));
    print(multiply("t", "2"));
}

See also:

io::Result

Early returns

In the previous example, we explicitly handled the errors using combinators. Another way to deal with this case analysis is to use a combination of match statements and early returns.

That is, we can simply stop executing the function and return the error if one occurs. For some, this form of code can be easier to both read and write. Consider this version of the previous example, rewritten using early returns:

use std::num::ParseIntError;

fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> {
    let first_number = match first_number_str.parse::<i32>() {
        Ok(first_number)  => first_number,
        Err(e) => return Err(e),
    };

    let second_number = match second_number_str.parse::<i32>() {
        Ok(second_number)  => second_number,
        Err(e) => return Err(e),
    };

    Ok(first_number * second_number)
}

fn print(result: Result<i32, ParseIntError>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    print(multiply("10", "2"));
    print(multiply("t", "2"));
}

At this point, we've learned to explicitly handle errors using combinators and early returns. While we generally want to avoid panicking, explicitly handling all of our errors is cumbersome.

In the next section, we'll introduce ? for the cases where we simply need to unwrap without possibly inducing panic.

Introducing ?

Sometimes we just want the simplicity of unwrap without the possibility of a panic. Until now, unwrap has forced us to nest deeper and deeper when what we really wanted was to get the variable out. This is exactly the purpose of ?.

Upon finding an Err, there are two valid actions to take:

  1. panic! which we already decided to try to avoid if possible
  2. return because an Err means it cannot be handled

? is almost1 exactly equivalent to an unwrap which returns instead of panicking on Errs. Let's see how we can simplify the earlier example that used combinators:

use std::num::ParseIntError;

fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> {
    let first_number = first_number_str.parse::<i32>()?;
    let second_number = second_number_str.parse::<i32>()?;

    Ok(first_number * second_number)
}

fn print(result: Result<i32, ParseIntError>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    print(multiply("10", "2"));
    print(multiply("t", "2"));
}

The try! macro

Before there was ?, the same functionality was achieved with the try! macro. The ? operator is now recommended, but you may still find try! when looking at older code. The same multiply function from the previous example would look like this using try!:

// To compile and run this example without errors, while using Cargo, change the value 
// of the `edition` field, in the `[package]` section of the `Cargo.toml` file, to "2015".

use std::num::ParseIntError;

fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> {
    let first_number = try!(first_number_str.parse::<i32>());
    let second_number = try!(second_number_str.parse::<i32>());

    Ok(first_number * second_number)
}

fn print(result: Result<i32, ParseIntError>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    print(multiply("10", "2"));
    print(multiply("t", "2"));
}

  1. See re-enter ? for more details.

Multiple error types

The previous examples have always been very convenient; Results interact with other Results and Options interact with other Options.

Sometimes an Option needs to interact with a Result, or a Result<T, Error1> needs to interact with a Result<T, Error2>. In those cases, we want to manage our different error types in a way that makes them composable and easy to interact with.

In the following code, two instances of unwrap generate different error types. Vec::first returns an Option, while parse::<i32> returns a Result<i32, ParseIntError>:

fn double_first(vec: Vec<&str>) -> i32 {
    let first = vec.first().unwrap(); // Generate error 1
    2 * first.parse::<i32>().unwrap() // Generate error 2
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    println!("The first doubled is {}", double_first(numbers));

    println!("The first doubled is {}", double_first(empty));
    // Error 1: the input vector is empty

    println!("The first doubled is {}", double_first(strings));
    // Error 2: the element doesn't parse to a number
}

Over the next sections, we'll see several strategies for handling these kind of problems.

Pulling Results out of Options

The most basic way of handling mixed error types is to just embed them in each other.

use std::num::ParseIntError;

fn double_first(vec: Vec<&str>) -> Option<Result<i32, ParseIntError>> {
    vec.first().map(|first| {
        first.parse::<i32>().map(|n| 2 * n)
    })
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    println!("The first doubled is {:?}", double_first(numbers));

    println!("The first doubled is {:?}", double_first(empty));
    // Error 1: the input vector is empty

    println!("The first doubled is {:?}", double_first(strings));
    // Error 2: the element doesn't parse to a number
}

There are times when we'll want to stop processing on errors (like with ?) but keep going when the Option is None. A couple of combinators come in handy to swap the Result and Option.

use std::num::ParseIntError;

fn double_first(vec: Vec<&str>) -> Result<Option<i32>, ParseIntError> {
    let opt = vec.first().map(|first| {
        first.parse::<i32>().map(|n| 2 * n)
    });

    opt.map_or(Ok(None), |r| r.map(Some))
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    println!("The first doubled is {:?}", double_first(numbers));
    println!("The first doubled is {:?}", double_first(empty));
    println!("The first doubled is {:?}", double_first(strings));
}

Defining an error type

Sometimes it simplifies the code to mask all of the different errors with a single type of error. We'll show this with a custom error.

Rust allows us to define our own error types. In general, a "good" error type:

  • Represents different errors with the same type
  • Presents nice error messages to the user
  • Is easy to compare with other types
    • Good: Err(EmptyVec)
    • Bad: Err("Please use a vector with at least one element".to_owned())
  • Can hold information about the error
    • Good: Err(BadChar(c, position))
    • Bad: Err("+ cannot be used here".to_owned())
  • Composes well with other errors
use std::fmt;

type Result<T> = std::result::Result<T, DoubleError>;

// Define our error types. These may be customized for our error handling cases.
// Now we will be able to write our own errors, defer to an underlying error
// implementation, or do something in between.
#[derive(Debug, Clone)]
struct DoubleError;

// Generation of an error is completely separate from how it is displayed.
// There's no need to be concerned about cluttering complex logic with the display style.
//
// Note that we don't store any extra info about the errors. This means we can't state
// which string failed to parse without modifying our types to carry that information.
impl fmt::Display for DoubleError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "invalid first item to double")
    }
}

fn double_first(vec: Vec<&str>) -> Result<i32> {
    vec.first()
        // Change the error to our new type.
        .ok_or(DoubleError)
        .and_then(|s| {
            s.parse::<i32>()
                // Update to the new error type here also.
                .map_err(|_| DoubleError)
                .map(|i| 2 * i)
        })
}

fn print(result: Result<i32>) {
    match result {
        Ok(n) => println!("The first doubled is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    print(double_first(numbers));
    print(double_first(empty));
    print(double_first(strings));
}

Boxing errors

A way to write simple code while preserving the original errors is to Box them. The drawback is that the underlying error type is only known at runtime and not statically determined.

The stdlib helps in boxing our errors by having Box implement conversion from any type that implements the Error trait into the trait object Box<Error>, via From.

use std::error;
use std::fmt;

// Change the alias to `Box<error::Error>`.
type Result<T> = std::result::Result<T, Box<dyn error::Error>>;

#[derive(Debug, Clone)]
struct EmptyVec;

impl fmt::Display for EmptyVec {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "invalid first item to double")
    }
}

impl error::Error for EmptyVec {}

fn double_first(vec: Vec<&str>) -> Result<i32> {
    vec.first()
        .ok_or_else(|| EmptyVec.into()) // Converts to Box
        .and_then(|s| {
            s.parse::<i32>()
                .map_err(|e| e.into()) // Converts to Box
                .map(|i| 2 * i)
        })
}

fn print(result: Result<i32>) {
    match result {
        Ok(n) => println!("The first doubled is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    print(double_first(numbers));
    print(double_first(empty));
    print(double_first(strings));
}

See also:

Dynamic dispatch and Error trait

Other uses of ?

Notice in the previous example that our immediate reaction to calling parse is to map the error from a library error into a boxed error:

.and_then(|s| s.parse::<i32>()
    .map_err(|e| e.into())

Since this is a simple and common operation, it would be convenient if it could be elided. Alas, because and_then is not sufficiently flexible, it cannot. However, we can instead use ?.

? was previously explained as either unwrap or return Err(err). This is only mostly true. It actually means unwrap or return Err(From::from(err)). Since From::from is a conversion utility between different types, this means that if you ? where the error is convertible to the return type, it will convert automatically.

Here, we rewrite the previous example using ?. As a result, the map_err will go away when From::from is implemented for our error type:

use std::error;
use std::fmt;

// Change the alias to `Box<dyn error::Error>`.
type Result<T> = std::result::Result<T, Box<dyn error::Error>>;

#[derive(Debug)]
struct EmptyVec;

impl fmt::Display for EmptyVec {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "invalid first item to double")
    }
}

impl error::Error for EmptyVec {}

// The same structure as before but rather than chain all `Results`
// and `Options` along, we `?` to get the inner value out immediately.
fn double_first(vec: Vec<&str>) -> Result<i32> {
    let first = vec.first().ok_or(EmptyVec)?;
    let parsed = first.parse::<i32>()?;
    Ok(2 * parsed)
}

fn print(result: Result<i32>) {
    match result {
        Ok(n)  => println!("The first doubled is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    print(double_first(numbers));
    print(double_first(empty));
    print(double_first(strings));
}

This is actually fairly clean now. Compared with the original panic, it is very similar to replacing the unwrap calls with ? except that the return types are Result. As a result, they must be destructured at the top level.

See also:

From::from and ?

Wrapping errors

An alternative to boxing errors is to wrap them in your own error type.

use std::error;
use std::error::Error as _;
use std::num::ParseIntError;
use std::fmt;

type Result<T> = std::result::Result<T, DoubleError>;

#[derive(Debug)]
enum DoubleError {
    EmptyVec,
    // We will defer to the parse error implementation for their error.
    // Supplying extra info requires adding more data to the type.
    Parse(ParseIntError),
}

impl fmt::Display for DoubleError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            DoubleError::EmptyVec =>
                write!(f, "please use a vector with at least one element"),
            // The wrapped error contains additional information and is available
            // via the source() method.
            DoubleError::Parse(..) =>
                write!(f, "the provided string could not be parsed as int"),
        }
    }
}

impl error::Error for DoubleError {
    fn source(&self) -> Option<&(dyn error::Error + 'static)> {
        match *self {
            DoubleError::EmptyVec => None,
            // The cause is the underlying implementation error type. Is implicitly
            // cast to the trait object `&error::Error`. This works because the
            // underlying type already implements the `Error` trait.
            DoubleError::Parse(ref e) => Some(e),
        }
    }
}

// Implement the conversion from `ParseIntError` to `DoubleError`.
// This will be automatically called by `?` if a `ParseIntError`
// needs to be converted into a `DoubleError`.
impl From<ParseIntError> for DoubleError {
    fn from(err: ParseIntError) -> DoubleError {
        DoubleError::Parse(err)
    }
}

fn double_first(vec: Vec<&str>) -> Result<i32> {
    let first = vec.first().ok_or(DoubleError::EmptyVec)?;
    // Here we implicitly use the `ParseIntError` implementation of `From` (which
    // we defined above) in order to create a `DoubleError`.
    let parsed = first.parse::<i32>()?;

    Ok(2 * parsed)
}

fn print(result: Result<i32>) {
    match result {
        Ok(n)  => println!("The first doubled is {}", n),
        Err(e) => {
            println!("Error: {}", e);
            if let Some(source) = e.source() {
                println!("  Caused by: {}", source);
            }
        },
    }
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    print(double_first(numbers));
    print(double_first(empty));
    print(double_first(strings));
}

This adds a bit more boilerplate for handling errors and might not be needed in all applications. There are some libraries that can take care of the boilerplate for you.

See also:

From::from and Enums

Iterating over Results

An Iter::map operation might fail, for example:

fn main() {
    let strings = vec!["tofu", "93", "18"];
    let numbers: Vec<_> = strings
        .into_iter()
        .map(|s| s.parse::<i32>())
        .collect();
    println!("Results: {:?}", numbers);
}

Let's step through strategies for handling this.

Ignore the failed items with filter_map()

filter_map calls a function and filters out the results that are None.

fn main() {
    let strings = vec!["tofu", "93", "18"];
    let numbers: Vec<_> = strings
        .into_iter()
        .map(|s| s.parse::<i32>())
        .filter_map(Result::ok)
        .collect();
    println!("Results: {:?}", numbers);
}

Fail the entire operation with collect()

Result implements FromIter so that a vector of results (Vec<Result<T, E>>) can be turned into a result with a vector (Result<Vec<T>, E>). Once an Result::Err is found, the iteration will terminate.

fn main() {
    let strings = vec!["tofu", "93", "18"];
    let numbers: Result<Vec<_>, _> = strings
        .into_iter()
        .map(|s| s.parse::<i32>())
        .collect();
    println!("Results: {:?}", numbers);
}

This same technique can be used with Option.

Collect all valid values and failures with partition()

fn main() {
    let strings = vec!["tofu", "93", "18"];
    let (numbers, errors): (Vec<_>, Vec<_>) = strings
        .into_iter()
        .map(|s| s.parse::<i32>())
        .partition(Result::is_ok);
    println!("Numbers: {:?}", numbers);
    println!("Errors: {:?}", errors);
}

When you look at the results, you'll note that everything is still wrapped in Result. A little more boilerplate is needed for this.

fn main() {
    let strings = vec!["tofu", "93", "18"];
    let (numbers, errors): (Vec<_>, Vec<_>) = strings
        .into_iter()
        .map(|s| s.parse::<i32>())
        .partition(Result::is_ok);
    let numbers: Vec<_> = numbers.into_iter().map(Result::unwrap).collect();
    let errors: Vec<_> = errors.into_iter().map(Result::unwrap_err).collect();
    println!("Numbers: {:?}", numbers);
    println!("Errors: {:?}", errors);
}

Std library types

The std library provides many custom types which expands drastically on the primitives. Some of these include:

  • growable Strings like: "hello world"
  • growable vectors: [1, 2, 3]
  • optional types: Option<i32>
  • error handling types: Result<i32, i32>
  • heap allocated pointers: Box<i32>

See also:

primitives and the std library

Box, stack and heap

All values in Rust are stack allocated by default. Values can be boxed (allocated on the heap) by creating a Box<T>. A box is a smart pointer to a heap allocated value of type T. When a box goes out of scope, its destructor is called, the inner object is destroyed, and the memory on the heap is freed.

Boxed values can be dereferenced using the * operator; this removes one layer of indirection.

use std::mem;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
struct Point {
    x: f64,
    y: f64,
}

// A Rectangle can be specified by where its top left and bottom right 
// corners are in space
#[allow(dead_code)]
struct Rectangle {
    top_left: Point,
    bottom_right: Point,
}

fn origin() -> Point {
    Point { x: 0.0, y: 0.0 }
}

fn boxed_origin() -> Box<Point> {
    // Allocate this point on the heap, and return a pointer to it
    Box::new(Point { x: 0.0, y: 0.0 })
}

fn main() {
    // (all the type annotations are superfluous)
    // Stack allocated variables
    let point: Point = origin();
    let rectangle: Rectangle = Rectangle {
        top_left: origin(),
        bottom_right: Point { x: 3.0, y: -4.0 }
    };

    // Heap allocated rectangle
    let boxed_rectangle: Box<Rectangle> = Box::new(Rectangle {
        top_left: origin(),
        bottom_right: Point { x: 3.0, y: -4.0 },
    });

    // The output of functions can be boxed
    let boxed_point: Box<Point> = Box::new(origin());

    // Double indirection
    let box_in_a_box: Box<Box<Point>> = Box::new(boxed_origin());

    println!("Point occupies {} bytes on the stack",
             mem::size_of_val(&point));
    println!("Rectangle occupies {} bytes on the stack",
             mem::size_of_val(&rectangle));

    // box size == pointer size
    println!("Boxed point occupies {} bytes on the stack",
             mem::size_of_val(&boxed_point));
    println!("Boxed rectangle occupies {} bytes on the stack",
             mem::size_of_val(&boxed_rectangle));
    println!("Boxed box occupies {} bytes on the stack",
             mem::size_of_val(&box_in_a_box));

    // Copy the data contained in `boxed_point` into `unboxed_point`
    let unboxed_point: Point = *boxed_point;
    println!("Unboxed point occupies {} bytes on the stack",
             mem::size_of_val(&unboxed_point));
}

Vectors

Vectors are re-sizable arrays. Like slices, their size is not known at compile time, but they can grow or shrink at any time. A vector is represented using 3 parameters:

  • pointer to the data
  • length
  • capacity

The capacity indicates how much memory is reserved for the vector. The vector can grow as long as the length is smaller than the capacity. When this threshold needs to be surpassed, the vector is reallocated with a larger capacity.

fn main() {
    // Iterators can be collected into vectors
    let collected_iterator: Vec<i32> = (0..10).collect();
    println!("Collected (0..10) into: {:?}", collected_iterator);

    // The `vec!` macro can be used to initialize a vector
    let mut xs = vec![1i32, 2, 3];
    println!("Initial vector: {:?}", xs);

    // Insert new element at the end of the vector
    println!("Push 4 into the vector");
    xs.push(4);
    println!("Vector: {:?}", xs);

    // Error! Immutable vectors can't grow
    collected_iterator.push(0);
    // FIXME ^ Comment out this line

    // The `len` method yields the number of elements currently stored in a vector
    println!("Vector length: {}", xs.len());

    // Indexing is done using the square brackets (indexing starts at 0)
    println!("Second element: {}", xs[1]);

    // `pop` removes the last element from the vector and returns it
    println!("Pop last element: {:?}", xs.pop());

    // Out of bounds indexing yields a panic
    println!("Fourth element: {}", xs[3]);
    // FIXME ^ Comment out this line

    // `Vector`s can be easily iterated over
    println!("Contents of xs:");
    for x in xs.iter() {
        println!("> {}", x);
    }

    // A `Vector` can also be iterated over while the iteration
    // count is enumerated in a separate variable (`i`)
    for (i, x) in xs.iter().enumerate() {
        println!("In position {} we have value {}", i, x);
    }

    // Thanks to `iter_mut`, mutable `Vector`s can also be iterated
    // over in a way that allows modifying each value
    for x in xs.iter_mut() {
        *x *= 3;
    }
    println!("Updated vector: {:?}", xs);
}

More Vec methods can be found under the std::vec module

Strings

There are two types of strings in Rust: String and &str.

A String is stored as a vector of bytes (Vec<u8>), but guaranteed to always be a valid UTF-8 sequence. String is heap allocated, growable and not null terminated.

&str is a slice (&[u8]) that always points to a valid UTF-8 sequence, and can be used to view into a String, just like &[T] is a view into Vec<T>.

fn main() {
    // (all the type annotations are superfluous)
    // A reference to a string allocated in read only memory
    let pangram: &'static str = "the quick brown fox jumps over the lazy dog";
    println!("Pangram: {}", pangram);

    // Iterate over words in reverse, no new string is allocated
    println!("Words in reverse");
    for word in pangram.split_whitespace().rev() {
        println!("> {}", word);
    }

    // Copy chars into a vector, sort and remove duplicates
    let mut chars: Vec<char> = pangram.chars().collect();
    chars.sort();
    chars.dedup();

    // Create an empty and growable `String`
    let mut string = String::new();
    for c in chars {
        // Insert a char at the end of string
        string.push(c);
        // Insert a string at the end of string
        string.push_str(", ");
    }

    // The trimmed string is a slice to the original string, hence no new
    // allocation is performed
    let chars_to_trim: &[char] = &[' ', ','];
    let trimmed_str: &str = string.trim_matches(chars_to_trim);
    println!("Used characters: {}", trimmed_str);

    // Heap allocate a string
    let alice = String::from("I like dogs");
    // Allocate new memory and store the modified string there
    let bob: String = alice.replace("dog", "cat");

    println!("Alice says: {}", alice);
    println!("Bob says: {}", bob);
}

More str/String methods can be found under the std::str and std::string modules

Literals and escapes

There are multiple ways to write string literals with special characters in them. All result in a similar &str so it's best to use the form that is the most convenient to write. Similarly there are multiple ways to write byte string literals, which all result in &[u8; N].

Generally special characters are escaped with a backslash character: \. This way you can add any character to your string, even unprintable ones and ones that you don't know how to type. If you want a literal backslash, escape it with another one: \\

String or character literal delimiters occuring within a literal must be escaped: "\"", '\''.

fn main() {
    // You can use escapes to write bytes by their hexadecimal values...
    let byte_escape = "I'm writing \x52\x75\x73\x74!";
    println!("What are you doing\x3F (\\x3F means ?) {}", byte_escape);

    // ...or Unicode code points.
    let unicode_codepoint = "\u{211D}";
    let character_name = "\"DOUBLE-STRUCK CAPITAL R\"";

    println!("Unicode character {} (U+211D) is called {}",
                unicode_codepoint, character_name );


    let long_string = "String literals
                        can span multiple lines.
                        The linebreak and indentation here ->\
                        <- can be escaped too!";
    println!("{}", long_string);
}

Sometimes there are just too many characters that need to be escaped or it's just much more convenient to write a string out as-is. This is where raw string literals come into play.

fn main() {
    let raw_str = r"Escapes don't work here: \x3F \u{211D}";
    println!("{}", raw_str);

    // If you need quotes in a raw string, add a pair of #s
    let quotes = r#"And then I said: "There is no escape!""#;
    println!("{}", quotes);

    // If you need "# in your string, just use more #s in the delimiter.
    // There is no limit for the number of #s you can use.
    let longer_delimiter = r###"A string with "# in it. And even "##!"###;
    println!("{}", longer_delimiter);
}

Want a string that's not UTF-8? (Remember, str and String must be valid UTF-8). Or maybe you want an array of bytes that's mostly text? Byte strings to the rescue!

use std::str;

fn main() {
    // Note that this is not actually a `&str`
    let bytestring: &[u8; 21] = b"this is a byte string";

    // Byte arrays don't have the `Display` trait, so printing them is a bit limited
    println!("A byte string: {:?}", bytestring);

    // Byte strings can have byte escapes...
    let escaped = b"\x52\x75\x73\x74 as bytes";
    // ...but no unicode escapes
    // let escaped = b"\u{211D} is not allowed";
    println!("Some escaped bytes: {:?}", escaped);


    // Raw byte strings work just like raw strings
    let raw_bytestring = br"\u{211D} is not escaped here";
    println!("{:?}", raw_bytestring);

    // Converting a byte array to `str` can fail
    if let Ok(my_str) = str::from_utf8(raw_bytestring) {
        println!("And the same as text: '{}'", my_str);
    }

    let _quotes = br#"You can also use "fancier" formatting, \
                    like with normal raw strings"#;

    // Byte strings don't have to be UTF-8
    let shift_jis = b"\x82\xe6\x82\xa8\x82\xb1\x82\xbb"; // "ようこそ" in SHIFT-JIS

    // But then they can't always be converted to `str`
    match str::from_utf8(shift_jis) {
        Ok(my_str) => println!("Conversion successful: '{}'", my_str),
        Err(e) => println!("Conversion failed: {:?}", e),
    };
}

For conversions between character encodings check out the encoding crate.

A more detailed listing of the ways to write string literals and escape characters is given in the 'Tokens' chapter of the Rust Reference.

Option

Sometimes it's desirable to catch the failure of some parts of a program instead of calling panic!; this can be accomplished using the Option enum.

The Option<T> enum has two variants:

  • None, to indicate failure or lack of value, and
  • Some(value), a tuple struct that wraps a value with type T.
// An integer division that doesn't `panic!`
fn checked_division(dividend: i32, divisor: i32) -> Option<i32> {
    if divisor == 0 {
        // Failure is represented as the `None` variant
        None
    } else {
        // Result is wrapped in a `Some` variant
        Some(dividend / divisor)
    }
}

// This function handles a division that may not succeed
fn try_division(dividend: i32, divisor: i32) {
    // `Option` values can be pattern matched, just like other enums
    match checked_division(dividend, divisor) {
        None => println!("{} / {} failed!", dividend, divisor),
        Some(quotient) => {
            println!("{} / {} = {}", dividend, divisor, quotient)
        },
    }
}

fn main() {
    try_division(4, 2);
    try_division(1, 0);

    // Binding `None` to a variable needs to be type annotated
    let none: Option<i32> = None;
    let _equivalent_none = None::<i32>;

    let optional_float = Some(0f32);

    // Unwrapping a `Some` variant will extract the value wrapped.
    println!("{:?} unwraps to {:?}", optional_float, optional_float.unwrap());

    // Unwrapping a `None` variant will `panic!`
    println!("{:?} unwraps to {:?}", none, none.unwrap());
}

Result

We've seen that the Option enum can be used as a return value from functions that may fail, where None can be returned to indicate failure. However, sometimes it is important to express why an operation failed. To do this we have the Result enum.

The Result<T, E> enum has two variants:

  • Ok(value) which indicates that the operation succeeded, and wraps the value returned by the operation. (value has type T)
  • Err(why), which indicates that the operation failed, and wraps why, which (hopefully) explains the cause of the failure. (why has type E)
mod checked {
    // Mathematical "errors" we want to catch
    #[derive(Debug)]
    pub enum MathError {
        DivisionByZero,
        NonPositiveLogarithm,
        NegativeSquareRoot,
    }

    pub type MathResult = Result<f64, MathError>;

    pub fn div(x: f64, y: f64) -> MathResult {
        if y == 0.0 {
            // This operation would `fail`, instead let's return the reason of
            // the failure wrapped in `Err`
            Err(MathError::DivisionByZero)
        } else {
            // This operation is valid, return the result wrapped in `Ok`
            Ok(x / y)
        }
    }

    pub fn sqrt(x: f64) -> MathResult {
        if x < 0.0 {
            Err(MathError::NegativeSquareRoot)
        } else {
            Ok(x.sqrt())
        }
    }

    pub fn ln(x: f64) -> MathResult {
        if x <= 0.0 {
            Err(MathError::NonPositiveLogarithm)
        } else {
            Ok(x.ln())
        }
    }
}

// `op(x, y)` === `sqrt(ln(x / y))`
fn op(x: f64, y: f64) -> f64 {
    // This is a three level match pyramid!
    match checked::div(x, y) {
        Err(why) => panic!("{:?}", why),
        Ok(ratio) => match checked::ln(ratio) {
            Err(why) => panic!("{:?}", why),
            Ok(ln) => match checked::sqrt(ln) {
                Err(why) => panic!("{:?}", why),
                Ok(sqrt) => sqrt,
            },
        },
    }
}

fn main() {
    // Will this fail?
    println!("{}", op(1.0, 10.0));
}

?

Chaining results using match can get pretty untidy; luckily, the ? operator can be used to make things pretty again. ? is used at the end of an expression returning a Result, and is equivalent to a match expression, where the Err(err) branch expands to an early Err(From::from(err)), and the Ok(ok) branch expands to an ok expression.

mod checked {
    #[derive(Debug)]
    enum MathError {
        DivisionByZero,
        NonPositiveLogarithm,
        NegativeSquareRoot,
    }

    type MathResult = Result<f64, MathError>;

    fn div(x: f64, y: f64) -> MathResult {
        if y == 0.0 {
            Err(MathError::DivisionByZero)
        } else {
            Ok(x / y)
        }
    }

    fn sqrt(x: f64) -> MathResult {
        if x < 0.0 {
            Err(MathError::NegativeSquareRoot)
        } else {
            Ok(x.sqrt())
        }
    }

    fn ln(x: f64) -> MathResult {
        if x <= 0.0 {
            Err(MathError::NonPositiveLogarithm)
        } else {
            Ok(x.ln())
        }
    }

    // Intermediate function
    fn op_(x: f64, y: f64) -> MathResult {
        // if `div` "fails", then `DivisionByZero` will be `return`ed
        let ratio = div(x, y)?;

        // if `ln` "fails", then `NonPositiveLogarithm` will be `return`ed
        let ln = ln(ratio)?;

        sqrt(ln)
    }

    pub fn op(x: f64, y: f64) {
        match op_(x, y) {
            Err(why) => panic!(match why {
                MathError::NonPositiveLogarithm
                    => "logarithm of non-positive number",
                MathError::DivisionByZero
                    => "division by zero",
                MathError::NegativeSquareRoot
                    => "square root of negative number",
            }),
            Ok(value) => println!("{}", value),
        }
    }
}

fn main() {
    checked::op(1.0, 10.0);
}

Be sure to check the documentation, as there are many methods to map/compose Result.

panic!

The panic! macro can be used to generate a panic and start unwinding its stack. While unwinding, the runtime will take care of freeing all the resources owned by the thread by calling the destructor of all its objects.

Since we are dealing with programs with only one thread, panic! will cause the program to report the panic message and exit.

// Re-implementation of integer division (/)
fn division(dividend: i32, divisor: i32) -> i32 {
    if divisor == 0 {
        // Division by zero triggers a panic
        panic!("division by zero");
    } else {
        dividend / divisor
    }
}

// The `main` task
fn main() {
    // Heap allocated integer
    let _x = Box::new(0i32);

    // This operation will trigger a task failure
    division(3, 0);

    println!("This point won't be reached!");

    // `_x` should get destroyed at this point
}

Let's check that panic! doesn't leak memory.

$ rustc panic.rs && valgrind ./panic
==4401== Memcheck, a memory error detector
==4401== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4401== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==4401== Command: ./panic
==4401== 
thread '<main>' panicked at 'division by zero', panic.rs:5
==4401== 
==4401== HEAP SUMMARY:
==4401==     in use at exit: 0 bytes in 0 blocks
==4401==   total heap usage: 18 allocs, 18 frees, 1,648 bytes allocated
==4401== 
==4401== All heap blocks were freed -- no leaks are possible
==4401== 
==4401== For counts of detected and suppressed errors, rerun with: -v
==4401== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

HashMap

Where vectors store values by an integer index, HashMaps store values by key. HashMap keys can be booleans, integers, strings, or any other type that implements the Eq and Hash traits. More on this in the next section.

Like vectors, HashMaps are growable, but HashMaps can also shrink themselves when they have excess space. You can create a HashMap with a certain starting capacity using HashMap::with_capacity(uint), or use HashMap::new() to get a HashMap with a default initial capacity (recommended).

use std::collections::HashMap;

fn call(number: &str) -> &str {
    match number {
        "798-1364" => "We're sorry, the call cannot be completed as dialed. 
            Please hang up and try again.",
        "645-7689" => "Hello, this is Mr. Awesome's Pizza. My name is Fred.
            What can I get for you today?",
        _ => "Hi! Who is this again?"
    }
}

fn main() { 
    let mut contacts = HashMap::new();

    contacts.insert("Daniel", "798-1364");
    contacts.insert("Ashley", "645-7689");
    contacts.insert("Katie", "435-8291");
    contacts.insert("Robert", "956-1745");

    // Takes a reference and returns Option<&V>
    match contacts.get(&"Daniel") {
        Some(&number) => println!("Calling Daniel: {}", call(number)),
        _ => println!("Don't have Daniel's number."),
    }

    // `HashMap::insert()` returns `None`
    // if the inserted value is new, `Some(value)` otherwise
    contacts.insert("Daniel", "164-6743");

    match contacts.get(&"Ashley") {
        Some(&number) => println!("Calling Ashley: {}", call(number)),
        _ => println!("Don't have Ashley's number."),
    }

    contacts.remove(&"Ashley"); 

    // `HashMap::iter()` returns an iterator that yields 
    // (&'a key, &'a value) pairs in arbitrary order.
    for (contact, &number) in contacts.iter() {
        println!("Calling {}: {}", contact, call(number)); 
    }
}

For more information on how hashing and hash maps (sometimes called hash tables) work, have a look at Hash Table Wikipedia

Alternate/custom key types

Any type that implements the Eq and Hash traits can be a key in HashMap. This includes:

  • bool (though not very useful since there is only two possible keys)
  • int, uint, and all variations thereof
  • String and &str (protip: you can have a HashMap keyed by String and call .get() with an &str)

Note that f32 and f64 do not implement Hash, likely because floating-point precision errors would make using them as hashmap keys horribly error-prone.

All collection classes implement Eq and Hash if their contained type also respectively implements Eq and Hash. For example, Vec<T> will implement Hash if T implements Hash.

You can easily implement Eq and Hash for a custom type with just one line: #[derive(PartialEq, Eq, Hash)]

The compiler will do the rest. If you want more control over the details, you can implement Eq and/or Hash yourself. This guide will not cover the specifics of implementing Hash.

To play around with using a struct in HashMap, let's try making a very simple user logon system:

use std::collections::HashMap;

// Eq requires that you derive PartialEq on the type.
#[derive(PartialEq, Eq, Hash)]
struct Account<'a>{
    username: &'a str,
    password: &'a str,
}

struct AccountInfo<'a>{
    name: &'a str,
    email: &'a str,
}

type Accounts<'a> = HashMap<Account<'a>, AccountInfo<'a>>;

fn try_logon<'a>(accounts: &Accounts<'a>,
        username: &'a str, password: &'a str){
    println!("Username: {}", username);
    println!("Password: {}", password);
    println!("Attempting logon...");

    let logon = Account {
        username,
        password,
    };

    match accounts.get(&logon) {
        Some(account_info) => {
            println!("Successful logon!");
            println!("Name: {}", account_info.name);
            println!("Email: {}", account_info.email);
        },
        _ => println!("Login failed!"),
    }
}

fn main(){
    let mut accounts: Accounts = HashMap::new();

    let account = Account {
        username: "j.everyman",
        password: "password123",
    };

    let account_info = AccountInfo {
        name: "John Everyman",
        email: "j.everyman@email.com",
    };

    accounts.insert(account, account_info);

    try_logon(&accounts, "j.everyman", "psasword123");

    try_logon(&accounts, "j.everyman", "password123");
}

HashSet

Consider a HashSet as a HashMap where we just care about the keys ( HashSet<T> is, in actuality, just a wrapper around HashMap<T, ()>).

"What's the point of that?" you ask. "I could just store the keys in a Vec."

A HashSet's unique feature is that it is guaranteed to not have duplicate elements. That's the contract that any set collection fulfills. HashSet is just one implementation. (see also: BTreeSet)

If you insert a value that is already present in the HashSet, (i.e. the new value is equal to the existing and they both have the same hash), then the new value will replace the old.

This is great for when you never want more than one of something, or when you want to know if you've already got something.

But sets can do more than that.

Sets have 4 primary operations (all of the following calls return an iterator):

  • union: get all the unique elements in both sets.

  • difference: get all the elements that are in the first set but not the second.

  • intersection: get all the elements that are only in both sets.

  • symmetric_difference: get all the elements that are in one set or the other, but not both.

Try all of these in the following example:

use std::collections::HashSet;

fn main() {
    let mut a: HashSet<i32> = vec![1i32, 2, 3].into_iter().collect();
    let mut b: HashSet<i32> = vec![2i32, 3, 4].into_iter().collect();

    assert!(a.insert(4));
    assert!(a.contains(&4));

    // `HashSet::insert()` returns false if
    // there was a value already present.
    assert!(b.insert(4), "Value 4 is already in set B!");
    // FIXME ^ Comment out this line

    b.insert(5);

    // If a collection's element type implements `Debug`,
    // then the collection implements `Debug`.
    // It usually prints its elements in the format `[elem1, elem2, ...]`
    println!("A: {:?}", a);
    println!("B: {:?}", b);

    // Print [1, 2, 3, 4, 5] in arbitrary order
    println!("Union: {:?}", a.union(&b).collect::<Vec<&i32>>());

    // This should print [1]
    println!("Difference: {:?}", a.difference(&b).collect::<Vec<&i32>>());

    // Print [2, 3, 4] in arbitrary order.
    println!("Intersection: {:?}", a.intersection(&b).collect::<Vec<&i32>>());

    // Print [1, 5]
    println!("Symmetric Difference: {:?}",
             a.symmetric_difference(&b).collect::<Vec<&i32>>());
}

(Examples are adapted from the documentation.)

Rc

When multiple ownership is needed, Rc(Reference Counting) can be used. Rc keeps track of the number of the references which means the number of owners of the value wrapped inside an Rc.

Reference count of an Rc increases by 1 whenever an Rc is cloned, and decreases by 1 whenever one cloned Rc is dropped out of the scope. When an Rc's reference count becomes zero, which means there are no owners remained, both the Rc and the value are all dropped.

Cloning an Rc never performs a deep copy. Cloning creates just another pointer to the wrapped value, and increments the count.

use std::rc::Rc;

fn main() {
    let rc_examples = "Rc examples".to_string();
    {
        println!("--- rc_a is created ---");
        
        let rc_a: Rc<String> = Rc::new(rc_examples);
        println!("Reference Count of rc_a: {}", Rc::strong_count(&rc_a));
        
        {
            println!("--- rc_a is cloned to rc_b ---");
            
            let rc_b: Rc<String> = Rc::clone(&rc_a);
            println!("Reference Count of rc_b: {}", Rc::strong_count(&rc_b));
            println!("Reference Count of rc_a: {}", Rc::strong_count(&rc_a));
            
            // Two `Rc`s are equal if their inner values are equal
            println!("rc_a and rc_b are equal: {}", rc_a.eq(&rc_b));
            
            // We can use methods of a value directly
            println!("Length of the value inside rc_a: {}", rc_a.len());
            println!("Value of rc_b: {}", rc_b);
            
            println!("--- rc_b is dropped out of scope ---");
        }
        
        println!("Reference Count of rc_a: {}", Rc::strong_count(&rc_a));
        
        println!("--- rc_a is dropped out of scope ---");
    }
    
    // Error! `rc_examples` already moved into `rc_a`
    // And when `rc_a` is dropped, `rc_examples` is dropped together
    // println!("rc_examples: {}", rc_examples);
    // TODO ^ Try uncommenting this line
}

See also:

std::rc and std::sync::arc.

Arc

When shared ownership between threads is needed, Arc(Atomic Reference Counted) can be used. This struct, via the Clone implementation can create a reference pointer for the location of a value in the memory heap while increasing the reference counter. As it shares ownership between threads, when the last reference pointer to a value is out of scope, the variable is dropped.


fn main() {
use std::sync::Arc;
use std::thread;

// This variable declaration is where it's value is specified.
let apple = Arc::new("the same apple");

for _ in 0..10 {
    // Here there is no value specification as it is a pointer to a reference
    // in the memory heap.
    let apple = Arc::clone(&apple);

    thread::spawn(move || {
        // As Arc was used, threads can be spawned using the value allocated
        // in the Arc variable pointer's location.
        println!("{:?}", apple);
    });
}
}

Std misc

Many other types are provided by the std library to support things such as:

  • Threads
  • Channels
  • File I/O

These expand beyond what the primitives provide.

See also:

primitives and the std library

Threads

Rust provides a mechanism for spawning native OS threads via the spawn function, the argument of this function is a moving closure.

use std::thread;

const NTHREADS: u32 = 10;

// This is the `main` thread
fn main() {
    // Make a vector to hold the children which are spawned.
    let mut children = vec![];

    for i in 0..NTHREADS {
        // Spin up another thread
        children.push(thread::spawn(move || {
            println!("this is thread number {}", i);
        }));
    }

    for child in children {
        // Wait for the thread to finish. Returns a result.
        let _ = child.join();
    }
}

These threads will be scheduled by the OS.

Testcase: map-reduce

Rust makes it very easy to parallelise data processing, without many of the headaches traditionally associated with such an attempt.

The standard library provides great threading primitives out of the box. These, combined with Rust's concept of Ownership and aliasing rules, automatically prevent data races.

The aliasing rules (one writable reference XOR many readable references) automatically prevent you from manipulating state that is visible to other threads. (Where synchronisation is needed, there are synchronisation primitives like Mutexes or Channels.)

In this example, we will calculate the sum of all digits in a block of numbers. We will do this by parcelling out chunks of the block into different threads. Each thread will sum its tiny block of digits, and subsequently we will sum the intermediate sums produced by each thread.

Note that, although we're passing references across thread boundaries, Rust understands that we're only passing read-only references, and that thus no unsafety or data races can occur. Because we're move-ing the data segments into the thread, Rust will also ensure the data is kept alive until the threads exit, so no dangling pointers occur.

use std::thread;

// This is the `main` thread
fn main() {

    // This is our data to process.
    // We will calculate the sum of all digits via a threaded  map-reduce algorithm.
    // Each whitespace separated chunk will be handled in a different thread.
    //
    // TODO: see what happens to the output if you insert spaces!
    let data = "86967897737416471853297327050364959
11861322575564723963297542624962850
70856234701860851907960690014725639
38397966707106094172783238747669219
52380795257888236525459303330302837
58495327135744041048897885734297812
69920216438980873548808413720956532
16278424637452589860345374828574668";

    // Make a vector to hold the child-threads which we will spawn.
    let mut children = vec![];

    /*************************************************************************
     * "Map" phase
     *
     * Divide our data into segments, and apply initial processing
     ************************************************************************/

    // split our data into segments for individual calculation
    // each chunk will be a reference (&str) into the actual data
    let chunked_data = data.split_whitespace();

    // Iterate over the data segments.
    // .enumerate() adds the current loop index to whatever is iterated
    // the resulting tuple "(index, element)" is then immediately
    // "destructured" into two variables, "i" and "data_segment" with a
    // "destructuring assignment"
    for (i, data_segment) in chunked_data.enumerate() {
        println!("data segment {} is \"{}\"", i, data_segment);

        // Process each data segment in a separate thread
        //
        // spawn() returns a handle to the new thread,
        // which we MUST keep to access the returned value
        //
        // 'move || -> u32' is syntax for a closure that:
        // * takes no arguments ('||')
        // * takes ownership of its captured variables ('move') and
        // * returns an unsigned 32-bit integer ('-> u32')
        //
        // Rust is smart enough to infer the '-> u32' from
        // the closure itself so we could have left that out.
        //
        // TODO: try removing the 'move' and see what happens
        children.push(thread::spawn(move || -> u32 {
            // Calculate the intermediate sum of this segment:
            let result = data_segment
                        // iterate over the characters of our segment..
                        .chars()
                        // .. convert text-characters to their number value..
                        .map(|c| c.to_digit(10).expect("should be a digit"))
                        // .. and sum the resulting iterator of numbers
                        .sum();

            // println! locks stdout, so no text-interleaving occurs
            println!("processed segment {}, result={}", i, result);

            // "return" not needed, because Rust is an "expression language", the
            // last evaluated expression in each block is automatically its value.
            result

        }));
    }


    /*************************************************************************
     * "Reduce" phase
     *
     * Collect our intermediate results, and combine them into a final result
     ************************************************************************/

    // collect each thread's intermediate results into a new Vec
    let mut intermediate_sums = vec![];
    for child in children {
        // collect each child thread's return-value
        let intermediate_sum = child.join().unwrap();
        intermediate_sums.push(intermediate_sum);
    }

    // combine all intermediate sums into a single final sum.
    //
    // we use the "turbofish" ::<> to provide sum() with a type hint.
    //
    // TODO: try without the turbofish, by instead explicitly
    // specifying the type of final_result
    let final_result = intermediate_sums.iter().sum::<u32>();

    println!("Final sum result: {}", final_result);
}

Assignments

It is not wise to let our number of threads depend on user inputted data. What if the user decides to insert a lot of spaces? Do we really want to spawn 2,000 threads? Modify the program so that the data is always chunked into a limited number of chunks, defined by a static constant at the beginning of the program.

See also:

Channels

Rust provides asynchronous channels for communication between threads. Channels allow a unidirectional flow of information between two end-points: the Sender and the Receiver.

use std::sync::mpsc::{Sender, Receiver};
use std::sync::mpsc;
use std::thread;

static NTHREADS: i32 = 3;

fn main() {
    // Channels have two endpoints: the `Sender<T>` and the `Receiver<T>`,
    // where `T` is the type of the message to be transferred
    // (type annotation is superfluous)
    let (tx, rx): (Sender<i32>, Receiver<i32>) = mpsc::channel();
    let mut children = Vec::new();

    for id in 0..NTHREADS {
        // The sender endpoint can be copied
        let thread_tx = tx.clone();

        // Each thread will send its id via the channel
        let child = thread::spawn(move || {
            // The thread takes ownership over `thread_tx`
            // Each thread queues a message in the channel
            thread_tx.send(id).unwrap();

            // Sending is a non-blocking operation, the thread will continue
            // immediately after sending its message
            println!("thread {} finished", id);
        });

        children.push(child);
    }

    // Here, all the messages are collected
    let mut ids = Vec::with_capacity(NTHREADS as usize);
    for _ in 0..NTHREADS {
        // The `recv` method picks a message from the channel
        // `recv` will block the current thread if there are no messages available
        ids.push(rx.recv());
    }
    
    // Wait for the threads to complete any remaining work
    for child in children {
        child.join().expect("oops! the child thread panicked");
    }

    // Show the order in which the messages were sent
    println!("{:?}", ids);
}

Path

The Path struct represents file paths in the underlying filesystem. There are two flavors of Path: posix::Path, for UNIX-like systems, and windows::Path, for Windows. The prelude exports the appropriate platform-specific Path variant.

A Path can be created from an OsStr, and provides several methods to get information from the file/directory the path points to.

Note that a Path is not internally represented as an UTF-8 string, but instead is stored as a vector of bytes (Vec<u8>). Therefore, converting a Path to a &str is not free and may fail (an Option is returned).

use std::path::Path;

fn main() {
    // Create a `Path` from an `&'static str`
    let path = Path::new(".");

    // The `display` method returns a `Show`able structure
    let _display = path.display();

    // `join` merges a path with a byte container using the OS specific
    // separator, and returns the new path
    let new_path = path.join("a").join("b");

    // Convert the path into a string slice
    match new_path.to_str() {
        None => panic!("new path is not a valid UTF-8 sequence"),
        Some(s) => println!("new path is {}", s),
    }
}

Be sure to check at other Path methods (posix::Path or windows::Path) and the Metadata struct.

See also:

OsStr and Metadata.

File I/O

The File struct represents a file that has been opened (it wraps a file descriptor), and gives read and/or write access to the underlying file.

Since many things can go wrong when doing file I/O, all the File methods return the io::Result<T> type, which is an alias for Result<T, io::Error>.

This makes the failure of all I/O operations explicit. Thanks to this, the programmer can see all the failure paths, and is encouraged to handle them in a proactive manner.

open

The open static method can be used to open a file in read-only mode.

A File owns a resource, the file descriptor and takes care of closing the file when it is droped.

use std::fs::File;
use std::io::prelude::*;
use std::path::Path;

fn main() {
    // Create a path to the desired file
    let path = Path::new("hello.txt");
    let display = path.display();

    // Open the path in read-only mode, returns `io::Result<File>`
    let mut file = match File::open(&path) {
        Err(why) => panic!("couldn't open {}: {}", display, why),
        Ok(file) => file,
    };

    // Read the file contents into a string, returns `io::Result<usize>`
    let mut s = String::new();
    match file.read_to_string(&mut s) {
        Err(why) => panic!("couldn't read {}: {}", display, why),
        Ok(_) => print!("{} contains:\n{}", display, s),
    }

    // `file` goes out of scope, and the "hello.txt" file gets closed
}

Here's the expected successful output:

$ echo "Hello World!" > hello.txt
$ rustc open.rs && ./open
hello.txt contains:
Hello World!

(You are encouraged to test the previous example under different failure conditions: hello.txt doesn't exist, or hello.txt is not readable, etc.)

create

The create static method opens a file in write-only mode. If the file already existed, the old content is destroyed. Otherwise, a new file is created.

static LOREM_IPSUM: &str =
    "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
";

use std::fs::File;
use std::io::prelude::*;
use std::path::Path;

fn main() {
    let path = Path::new("lorem_ipsum.txt");
    let display = path.display();

    // Open a file in write-only mode, returns `io::Result<File>`
    let mut file = match File::create(&path) {
        Err(why) => panic!("couldn't create {}: {}", display, why),
        Ok(file) => file,
    };

    // Write the `LOREM_IPSUM` string to `file`, returns `io::Result<()>`
    match file.write_all(LOREM_IPSUM.as_bytes()) {
        Err(why) => panic!("couldn't write to {}: {}", display, why),
        Ok(_) => println!("successfully wrote to {}", display),
    }
}

Here's the expected successful output:

$ rustc create.rs && ./create
successfully wrote to lorem_ipsum.txt
$ cat lorem_ipsum.txt
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

(As in the previous example, you are encouraged to test this example under failure conditions.)

There is OpenOptions struct that can be used to configure how a file is opened.

read_lines

The method lines() returns an iterator over the lines of a file.

File::open expects a generic, AsRef<Path>. That's what read_lines() expects as input.

use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;

fn main() {
    // File hosts must exist in current path before this produces output
    if let Ok(lines) = read_lines("./hosts") {
        // Consumes the iterator, returns an (Optional) String
        for line in lines {
            if let Ok(ip) = line {
                println!("{}", ip);
            }
        }
    }
}

// The output is wrapped in a Result to allow matching on errors
// Returns an Iterator to the Reader of the lines of the file.
fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where P: AsRef<Path>, {
    let file = File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}

Running this program simply prints the lines individually.

$ echo -e "127.0.0.1\n192.168.0.1\n" > hosts
$ rustc read_lines.rs && ./read_lines
127.0.0.1
192.168.0.1

This process is more efficient than creating a String in memory especially working with larger files.

Child processes

The process::Output struct represents the output of a finished child process, and the process::Command struct is a process builder.

use std::process::Command;

fn main() {
    let output = Command::new("rustc")
        .arg("--version")
        .output().unwrap_or_else(|e| {
            panic!("failed to execute process: {}", e)
    });

    if output.status.success() {
        let s = String::from_utf8_lossy(&output.stdout);

        print!("rustc succeeded and stdout was:\n{}", s);
    } else {
        let s = String::from_utf8_lossy(&output.stderr);

        print!("rustc failed and stderr was:\n{}", s);
    }
}

(You are encouraged to try the previous example with an incorrect flag passed to rustc)

Pipes

The std::Child struct represents a running child process, and exposes the stdin, stdout and stderr handles for interaction with the underlying process via pipes.

use std::io::prelude::*;
use std::process::{Command, Stdio};

static PANGRAM: &'static str =
"the quick brown fox jumped over the lazy dog\n";

fn main() {
    // Spawn the `wc` command
    let process = match Command::new("wc")
                                .stdin(Stdio::piped())
                                .stdout(Stdio::piped())
                                .spawn() {
        Err(why) => panic!("couldn't spawn wc: {}", why),
        Ok(process) => process,
    };

    // Write a string to the `stdin` of `wc`.
    //
    // `stdin` has type `Option<ChildStdin>`, but since we know this instance
    // must have one, we can directly `unwrap` it.
    match process.stdin.unwrap().write_all(PANGRAM.as_bytes()) {
        Err(why) => panic!("couldn't write to wc stdin: {}", why),
        Ok(_) => println!("sent pangram to wc"),
    }

    // Because `stdin` does not live after the above calls, it is `drop`ed,
    // and the pipe is closed.
    //
    // This is very important, otherwise `wc` wouldn't start processing the
    // input we just sent.

    // The `stdout` field also has type `Option<ChildStdout>` so must be unwrapped.
    let mut s = String::new();
    match process.stdout.unwrap().read_to_string(&mut s) {
        Err(why) => panic!("couldn't read wc stdout: {}", why),
        Ok(_) => print!("wc responded with:\n{}", s),
    }
}

Wait

If you'd like to wait for a process::Child to finish, you must call Child::wait, which will return a process::ExitStatus.

use std::process::Command;

fn main() {
    let mut child = Command::new("sleep").arg("5").spawn().unwrap();
    let _result = child.wait().unwrap();

    println!("reached end of main");
}
$ rustc wait.rs && ./wait
# `wait` keeps running for 5 seconds until the `sleep 5` command finishes
reached end of main

Filesystem Operations

The std::fs module contains several functions that deal with the filesystem.

use std::fs;
use std::fs::{File, OpenOptions};
use std::io;
use std::io::prelude::*;
use std::os::unix;
use std::path::Path;

// A simple implementation of `% cat path`
fn cat(path: &Path) -> io::Result<String> {
    let mut f = File::open(path)?;
    let mut s = String::new();
    match f.read_to_string(&mut s) {
        Ok(_) => Ok(s),
        Err(e) => Err(e),
    }
}

// A simple implementation of `% echo s > path`
fn echo(s: &str, path: &Path) -> io::Result<()> {
    let mut f = File::create(path)?;

    f.write_all(s.as_bytes())
}

// A simple implementation of `% touch path` (ignores existing files)
fn touch(path: &Path) -> io::Result<()> {
    match OpenOptions::new().create(true).write(true).open(path) {
        Ok(_) => Ok(()),
        Err(e) => Err(e),
    }
}

fn main() {
    println!("`mkdir a`");
    // Create a directory, returns `io::Result<()>`
    match fs::create_dir("a") {
        Err(why) => println!("! {:?}", why.kind()),
        Ok(_) => {},
    }

    println!("`echo hello > a/b.txt`");
    // The previous match can be simplified using the `unwrap_or_else` method
    echo("hello", &Path::new("a/b.txt")).unwrap_or_else(|why| {
        println!("! {:?}", why.kind());
    });

    println!("`mkdir -p a/c/d`");
    // Recursively create a directory, returns `io::Result<()>`
    fs::create_dir_all("a/c/d").unwrap_or_else(|why| {
        println!("! {:?}", why.kind());
    });

    println!("`touch a/c/e.txt`");
    touch(&Path::new("a/c/e.txt")).unwrap_or_else(|why| {
        println!("! {:?}", why.kind());
    });

    println!("`ln -s ../b.txt a/c/b.txt`");
    // Create a symbolic link, returns `io::Result<()>`
    if cfg!(target_family = "unix") {
        unix::fs::symlink("../b.txt", "a/c/b.txt").unwrap_or_else(|why| {
        println!("! {:?}", why.kind());
        });
    }

    println!("`cat a/c/b.txt`");
    match cat(&Path::new("a/c/b.txt")) {
        Err(why) => println!("! {:?}", why.kind()),
        Ok(s) => println!("> {}", s),
    }

    println!("`ls a`");
    // Read the contents of a directory, returns `io::Result<Vec<Path>>`
    match fs::read_dir("a") {
        Err(why) => println!("! {:?}", why.kind()),
        Ok(paths) => for path in paths {
            println!("> {:?}", path.unwrap().path());
        },
    }

    println!("`rm a/c/e.txt`");
    // Remove a file, returns `io::Result<()>`
    fs::remove_file("a/c/e.txt").unwrap_or_else(|why| {
        println!("! {:?}", why.kind());
    });

    println!("`rmdir a/c/d`");
    // Remove an empty directory, returns `io::Result<()>`
    fs::remove_dir("a/c/d").unwrap_or_else(|why| {
        println!("! {:?}", why.kind());
    });
}

Here's the expected successful output:

$ rustc fs.rs && ./fs
`mkdir a`
`echo hello > a/b.txt`
`mkdir -p a/c/d`
`touch a/c/e.txt`
`ln -s ../b.txt a/c/b.txt`
`cat a/c/b.txt`
> hello
`ls a`
> "a/b.txt"
> "a/c"
`rm a/c/e.txt`
`rmdir a/c/d`

And the final state of the a directory is:

$ tree a
a
|-- b.txt
`-- c
    `-- b.txt -> ../b.txt

1 directory, 2 files

An alternative way to define the function cat is with ? notation:

fn cat(path: &Path) -> io::Result<String> {
    let mut f = File::open(path)?;
    let mut s = String::new();
    f.read_to_string(&mut s)?;
    Ok(s)
}

See also:

cfg!

Program arguments

Standard Library

The command line arguments can be accessed using std::env::args, which returns an iterator that yields a String for each argument:

use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();

    // The first argument is the path that was used to call the program.
    println!("My path is {}.", args[0]);

    // The rest of the arguments are the passed command line parameters.
    // Call the program like this:
    //   $ ./args arg1 arg2
    println!("I got {:?} arguments: {:?}.", args.len() - 1, &args[1..]);
}
$ ./args 1 2 3
My path is ./args.
I got 3 arguments: ["1", "2", "3"].

Crates

Alternatively, there are numerous crates that can provide extra functionality when creating command-line applications. The Rust Cookbook exhibits best practices on how to use one of the more popular command line argument crates, clap.

Argument parsing

Matching can be used to parse simple arguments:

use std::env;

fn increase(number: i32) {
    println!("{}", number + 1);
}

fn decrease(number: i32) {
    println!("{}", number - 1);
}

fn help() {
    println!("usage:
match_args <string>
    Check whether given string is the answer.
match_args {{increase|decrease}} <integer>
    Increase or decrease given integer by one.");
}

fn main() {
    let args: Vec<String> = env::args().collect();

    match args.len() {
        // no arguments passed
        1 => {
            println!("My name is 'match_args'. Try passing some arguments!");
        },
        // one argument passed
        2 => {
            match args[1].parse() {
                Ok(42) => println!("This is the answer!"),
                _ => println!("This is not the answer."),
            }
        },
        // one command and one argument passed
        3 => {
            let cmd = &args[1];
            let num = &args[2];
            // parse the number
            let number: i32 = match num.parse() {
                Ok(n) => {
                    n
                },
                Err(_) => {
                    eprintln!("error: second argument not an integer");
                    help();
                    return;
                },
            };
            // parse the command
            match &cmd[..] {
                "increase" => increase(number),
                "decrease" => decrease(number),
                _ => {
                    eprintln!("error: invalid command");
                    help();
                },
            }
        },
        // all the other cases
        _ => {
            // show a help message
            help();
        }
    }
}
$ ./match_args Rust
This is not the answer.
$ ./match_args 42
This is the answer!
$ ./match_args do something
error: second argument not an integer
usage:
match_args <string>
    Check whether given string is the answer.
match_args {increase|decrease} <integer>
    Increase or decrease given integer by one.
$ ./match_args do 42
error: invalid command
usage:
match_args <string>
    Check whether given string is the answer.
match_args {increase|decrease} <integer>
    Increase or decrease given integer by one.
$ ./match_args increase 42
43

Foreign Function Interface

Rust provides a Foreign Function Interface (FFI) to C libraries. Foreign functions must be declared inside an extern block annotated with a #[link] attribute containing the name of the foreign library.

use std::fmt;

// this extern block links to the libm library
#[link(name = "m")]
extern {
    // this is a foreign function
    // that computes the square root of a single precision complex number
    fn csqrtf(z: Complex) -> Complex;

    fn ccosf(z: Complex) -> Complex;
}

// Since calling foreign functions is considered unsafe,
// it's common to write safe wrappers around them.
fn cos(z: Complex) -> Complex {
    unsafe { ccosf(z) }
}

fn main() {
    // z = -1 + 0i
    let z = Complex { re: -1., im: 0. };

    // calling a foreign function is an unsafe operation
    let z_sqrt = unsafe { csqrtf(z) };

    println!("the square root of {:?} is {:?}", z, z_sqrt);

    // calling safe API wrapped around unsafe operation
    println!("cos({:?}) = {:?}", z, cos(z));
}

// Minimal implementation of single precision complex numbers
#[repr(C)]
#[derive(Clone, Copy)]
struct Complex {
    re: f32,
    im: f32,
}

impl fmt::Debug for Complex {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        if self.im < 0. {
            write!(f, "{}-{}i", self.re, -self.im)
        } else {
            write!(f, "{}+{}i", self.re, self.im)
        }
    }
}

Testing

Rust is a programming language that cares a lot about correctness and it includes support for writing software tests within the language itself.

Testing comes in three styles:

Also Rust has support for specifying additional dependencies for tests:

See Also

Unit testing

Tests are Rust functions that verify that the non-test code is functioning in the expected manner. The bodies of test functions typically perform some setup, run the code we want to test, then assert whether the results are what we expect.

Most unit tests go into a tests mod with the #[cfg(test)] attribute. Test functions are marked with the #[test] attribute.

Tests fail when something in the test function panics. There are some helper macros:

  • assert!(expression) - panics if expression evaluates to false.
  • assert_eq!(left, right) and assert_ne!(left, right) - testing left and right expressions for equality and inequality respectively.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

// This is a really bad adding function, its purpose is to fail in this
// example.
#[allow(dead_code)]
fn bad_add(a: i32, b: i32) -> i32 {
    a - b
}

#[cfg(test)]
mod tests {
    // Note this useful idiom: importing names from outer (for mod tests) scope.
    use super::*;

    #[test]
    fn test_add() {
        assert_eq!(add(1, 2), 3);
    }

    #[test]
    fn test_bad_add() {
        // This assert would fire and test will fail.
        // Please note, that private functions can be tested too!
        assert_eq!(bad_add(1, 2), 3);
    }
}

Tests can be run with cargo test.

$ cargo test

running 2 tests
test tests::test_bad_add ... FAILED
test tests::test_add ... ok

failures:

---- tests::test_bad_add stdout ----
        thread 'tests::test_bad_add' panicked at 'assertion failed: `(left == right)`
  left: `-1`,
 right: `3`', src/lib.rs:21:8
note: Run with `RUST_BACKTRACE=1` for a backtrace.


failures:
    tests::test_bad_add

test result: FAILED. 1 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

Tests and ?

None of the previous unit test examples had a return type. But in Rust 2018, your unit tests can return Result<()>, which lets you use ? in them! This can make them much more concise.

fn sqrt(number: f64) -> Result<f64, String> {
    if number >= 0.0 {
        Ok(number.powf(0.5))
    } else {
        Err("negative floats don't have square roots".to_owned())
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_sqrt() -> Result<(), String> {
        let x = 4.0;
        assert_eq!(sqrt(x)?.powf(2.0), x);
        Ok(())
    }
}

See "The Edition Guide" for more details.

Testing panics

To check functions that should panic under certain circumstances, use attribute #[should_panic]. This attribute accepts optional parameter expected = with the text of the panic message. If your function can panic in multiple ways, it helps make sure your test is testing the correct panic.

pub fn divide_non_zero_result(a: u32, b: u32) -> u32 {
    if b == 0 {
        panic!("Divide-by-zero error");
    } else if a < b {
        panic!("Divide result is zero");
    }
    a / b
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_divide() {
        assert_eq!(divide_non_zero_result(10, 2), 5);
    }

    #[test]
    #[should_panic]
    fn test_any_panic() {
        divide_non_zero_result(1, 0);
    }

    #[test]
    #[should_panic(expected = "Divide result is zero")]
    fn test_specific_panic() {
        divide_non_zero_result(1, 10);
    }
}

Running these tests gives us:

$ cargo test

running 3 tests
test tests::test_any_panic ... ok
test tests::test_divide ... ok
test tests::test_specific_panic ... ok

test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests tmp-test-should-panic

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Running specific tests

To run specific tests one may specify the test name to cargo test command.

$ cargo test test_any_panic
running 1 test
test tests::test_any_panic ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 2 filtered out

   Doc-tests tmp-test-should-panic

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

To run multiple tests one may specify part of a test name that matches all the tests that should be run.

$ cargo test panic
running 2 tests
test tests::test_any_panic ... ok
test tests::test_specific_panic ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out

   Doc-tests tmp-test-should-panic

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Ignoring tests

Tests can be marked with the #[ignore] attribute to exclude some tests. Or to run them with command cargo test -- --ignored

#![allow(unused)]
fn main() {
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_add() {
        assert_eq!(add(2, 2), 4);
    }

    #[test]
    fn test_add_hundred() {
        assert_eq!(add(100, 2), 102);
        assert_eq!(add(2, 100), 102);
    }

    #[test]
    #[ignore]
    fn ignored_test() {
        assert_eq!(add(0, 0), 0);
    }
}
}
$ cargo test
running 3 tests
test tests::ignored_test ... ignored
test tests::test_add ... ok
test tests::test_add_hundred ... ok

test result: ok. 2 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out

   Doc-tests tmp-ignore

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

$ cargo test -- --ignored
running 1 test
test tests::ignored_test ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests tmp-ignore

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Documentation testing

The primary way of documenting a Rust project is through annotating the source code. Documentation comments are written in markdown and support code blocks in them. Rust takes care about correctness, so these code blocks are compiled and used as tests.

/// First line is a short summary describing function.
///
/// The next lines present detailed documentation. Code blocks start with
/// triple backquotes and have implicit `fn main()` inside
/// and `extern crate <cratename>`. Assume we're testing `doccomments` crate:
///
/// ```
/// let result = doccomments::add(2, 3);
/// assert_eq!(result, 5);
/// ```
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

/// Usually doc comments may include sections "Examples", "Panics" and "Failures".
///
/// The next function divides two numbers.
///
/// # Examples
///
/// ```
/// let result = doccomments::div(10, 2);
/// assert_eq!(result, 5);
/// ```
///
/// # Panics
///
/// The function panics if the second argument is zero.
///
/// ```rust,should_panic
/// // panics on division by zero
/// doccomments::div(10, 0);
/// ```
pub fn div(a: i32, b: i32) -> i32 {
    if b == 0 {
        panic!("Divide-by-zero error");
    }

    a / b
}

Tests can be run with cargo test:

$ cargo test
running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests doccomments

running 3 tests
test src/lib.rs - add (line 7) ... ok
test src/lib.rs - div (line 21) ... ok
test src/lib.rs - div (line 31) ... ok

test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Motivation behind documentation tests

The main purpose of documentation tests is to serve as examples that exercise the functionality, which is one of the most important guidelines. It allows using examples from docs as complete code snippets. But using ? makes compilation fail since main returns unit. The ability to hide some source lines from documentation comes to the rescue: one may write fn try_main() -> Result<(), ErrorType>, hide it and unwrap it in hidden main. Sounds complicated? Here's an example:

/// Using hidden `try_main` in doc tests.
///
/// ```
/// # // hidden lines start with `#` symbol, but they're still compileable!
/// # fn try_main() -> Result<(), String> { // line that wraps the body shown in doc
/// let res = try::try_div(10, 2)?;
/// # Ok(()) // returning from try_main
/// # }
/// # fn main() { // starting main that'll unwrap()
/// #    try_main().unwrap(); // calling try_main and unwrapping
/// #                         // so that test will panic in case of error
/// # }
/// ```
pub fn try_div(a: i32, b: i32) -> Result<i32, String> {
    if b == 0 {
        Err(String::from("Divide-by-zero"))
    } else {
        Ok(a / b)
    }
}

See Also

Integration testing

Unit tests are testing one module in isolation at a time: they're small and can test private code. Integration tests are external to your crate and use only its public interface in the same way any other code would. Their purpose is to test that many parts of your library work correctly together.

Cargo looks for integration tests in tests directory next to src.

File src/lib.rs:

// Define this in a crate called `adder`.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

File with test: tests/integration_test.rs:

#[test]
fn test_add() {
    assert_eq!(adder::add(3, 2), 5);
}

Running tests with cargo test command:

$ cargo test
running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

     Running target/debug/deps/integration_test-bcd60824f5fbfe19

running 1 test
test test_add ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests adder

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Each Rust source file in tests directory is compiled as a separate crate. One way of sharing some code between integration tests is making module with public functions, importing and using it within tests.

File tests/common.rs:

pub fn setup() {
    // some setup code, like creating required files/directories, starting
    // servers, etc.
}

File with test: tests/integration_test.rs

// importing common module.
mod common;

#[test]
fn test_add() {
    // using common code.
    common::setup();
    assert_eq!(adder::add(3, 2), 5);
}

Modules with common code follow the ordinary modules rules, so it's ok to create common module as tests/common/mod.rs.

Development dependencies

Sometimes there is a need to have dependencies for tests (or examples, or benchmarks) only. Such dependencies are added to Cargo.toml in the [dev-dependencies] section. These dependencies are not propagated to other packages which depend on this package.

One such example is using a crate that extends standard assert! macros.
File Cargo.toml:

# standard crate data is left out
[dev-dependencies]
pretty_assertions = "0.4.0"

File src/lib.rs:

// externing crate for test-only use
#[cfg(test)]
#[macro_use]
extern crate pretty_assertions;

pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_add() {
        assert_eq!(add(2, 3), 5);
    }
}

See Also

Cargo docs on specifying dependencies.

Unsafe Operations

As an introduction to this section, to borrow from the official docs, "one should try to minimize the amount of unsafe code in a code base." With that in mind, let's get started! Unsafe annotations in Rust are used to bypass protections put in place by the compiler; specifically, there are four primary things that unsafe is used for:

  • dereferencing raw pointers
  • calling functions or methods which are unsafe (including calling a function over FFI, see a previous chapter of the book)
  • accessing or modifying static mutable variables
  • implementing unsafe traits

Raw Pointers

Raw pointers * and references &T function similarly, but references are always safe because they are guaranteed to point to valid data due to the borrow checker. Dereferencing a raw pointer can only be done through an unsafe block.

fn main() {
    let raw_p: *const u32 = &10;

    unsafe {
        assert!(*raw_p == 10);
    }
}

Calling Unsafe Functions

Some functions can be declared as unsafe, meaning it is the programmer's responsibility to ensure correctness instead of the compiler's. One example of this is std::slice::from_raw_parts which will create a slice given a pointer to the first element and a length.

use std::slice;

fn main() {
    let some_vector = vec![1, 2, 3, 4];

    let pointer = some_vector.as_ptr();
    let length = some_vector.len();

    unsafe {
        let my_slice: &[u32] = slice::from_raw_parts(pointer, length);

        assert_eq!(some_vector.as_slice(), my_slice);
    }
}

For slice::from_raw_parts, one of the assumptions which must be upheld is that the pointer passed in points to valid memory and that the memory pointed to is of the correct type. If these invariants aren't upheld then the program's behaviour is undefined and there is no knowing what will happen.

인라인 어셈블리

Rust provides support for inline assembly via the asm! macro. It can be used to embed handwritten assembly in the assembly output generated by the compiler. Generally this should not be necessary, but might be where the required performance or timing cannot be otherwise achieved. Accessing low level hardware primitives, e.g. in kernel code, may also demand this functionality.

Note: the examples here are given in x86/x86-64 assembly, but other architectures are also supported.

Inline assembly is currently supported on the following architectures:

  • x86 and x86-64
  • ARM
  • AArch64
  • RISC-V

Basic usage

Let us start with the simplest possible example:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

unsafe {
    asm!("nop");
}
}
}

This will insert a NOP (no operation) instruction into the assembly generated by the compiler. Note that all asm! invocations have to be inside an unsafe block, as they could insert arbitrary instructions and break various invariants. The instructions to be inserted are listed in the first argument of the asm! macro as a string literal.

Inputs and outputs

Now inserting an instruction that does nothing is rather boring. Let us do something that actually acts on data:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let x: u64;
unsafe {
    asm!("mov {}, 5", out(reg) x);
}
assert_eq!(x, 5);
}
}

This will write the value 5 into the u64 variable x. You can see that the string literal we use to specify instructions is actually a template string. It is governed by the same rules as Rust format strings. The arguments that are inserted into the template however look a bit different than you may be familiar with. First we need to specify if the variable is an input or an output of the inline assembly. In this case it is an output. We declared this by writing out. We also need to specify in what kind of register the assembly expects the variable. In this case we put it in an arbitrary general purpose register by specifying reg. The compiler will choose an appropriate register to insert into the template and will read the variable from there after the inline assembly finishes executing.

Let us see another example that also uses an input:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let i: u64 = 3;
let o: u64;
unsafe {
    asm!(
        "mov {0}, {1}",
        "add {0}, 5",
        out(reg) o,
        in(reg) i,
    );
}
assert_eq!(o, 8);
}
}

This will add 5 to the input in variable i and write the result to variable o. The particular way this assembly does this is first copying the value from i to the output, and then adding 5 to it.

The example shows a few things:

First, we can see that asm! allows multiple template string arguments; each one is treated as a separate line of assembly code, as if they were all joined together with newlines between them. This makes it easy to format assembly code.

Second, we can see that inputs are declared by writing in instead of out.

Third, we can see that we can specify an argument number, or name as in any format string. For inline assembly templates this is particularly useful as arguments are often used more than once. For more complex inline assembly using this facility is generally recommended, as it improves readability, and allows reordering instructions without changing the argument order.

We can further refine the above example to avoid the mov instruction:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut x: u64 = 3;
unsafe {
    asm!("add {0}, 5", inout(reg) x);
}
assert_eq!(x, 8);
}
}

We can see that inout is used to specify an argument that is both input and output. This is different from specifying an input and output separately in that it is guaranteed to assign both to the same register.

It is also possible to specify different variables for the input and output parts of an inout operand:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let x: u64 = 3;
let y: u64;
unsafe {
    asm!("add {0}, 5", inout(reg) x => y);
}
assert_eq!(y, 8);
}
}

Late output operands

The Rust compiler is conservative with its allocation of operands. It is assumed that an out can be written at any time, and can therefore not share its location with any other argument. However, to guarantee optimal performance it is important to use as few registers as possible, so they won't have to be saved and reloaded around the inline assembly block. To achieve this Rust provides a lateout specifier. This can be used on any output that is written only after all inputs have been consumed. There is also an inlateout variant of this specifier.

Here is an example where inlateout cannot be used in release mode or other optimized cases:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut a: u64 = 4;
let b: u64 = 4;
let c: u64 = 4;
unsafe {
    asm!(
        "add {0}, {1}",
        "add {0}, {2}",
        inout(reg) a,
        in(reg) b,
        in(reg) c,
    );
}
assert_eq!(a, 12);
}
}

In unoptimized cases (e.g. Debug mode), replacing inout(reg) a with inlateout(reg) a in the above example can continue to give the expected result. However, with release mode or other optimized cases, using inlateout(reg) a can instead lead to the final value a = 16, causing the assertion to fail.

This is because in optimized cases, the compiler is free to allocate the same register for inputs b and c since it knows that they have the same value. Furthermore, when inlateout is used, a and c could be allocated to the same register, in which case the first add instruction would overwrite the initial load from variable c. This is in contrast to how using inout(reg) a ensures a separate register is allocated for a.

However, the following example can use inlateout since the output is only modified after all input registers have been read:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut a: u64 = 4;
let b: u64 = 4;
unsafe {
    asm!("add {0}, {1}", inlateout(reg) a, in(reg) b);
}
assert_eq!(a, 8);
}
}

As you can see, this assembly fragment will still work correctly if a and b are assigned to the same register.

Explicit register operands

Some instructions require that the operands be in a specific register. Therefore, Rust inline assembly provides some more specific constraint specifiers. While reg is generally available on any architecture, explicit registers are highly architecture specific. E.g. for x86 the general purpose registers eax, ebx, ecx, edx, ebp, esi, and edi among others can be addressed by their name.

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let cmd = 0xd1;
unsafe {
    asm!("out 0x64, eax", in("eax") cmd);
}
}
}

In this example we call the out instruction to output the content of the cmd variable to port 0x64. Since the out instruction only accepts eax (and its sub registers) as operand we had to use the eax constraint specifier.

Note: unlike other operand types, explicit register operands cannot be used in the template string: you can't use {} and should write the register name directly instead. Also, they must appear at the end of the operand list after all other operand types.

Consider this example which uses the x86 mul instruction:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

fn mul(a: u64, b: u64) -> u128 {
    let lo: u64;
    let hi: u64;

    unsafe {
        asm!(
            // The x86 mul instruction takes rax as an implicit input and writes
            // the 128-bit result of the multiplication to rax:rdx.
            "mul {}",
            in(reg) a,
            inlateout("rax") b => lo,
            lateout("rdx") hi
        );
    }

    ((hi as u128) << 64) + lo as u128
}
}
}

This uses the mul instruction to multiply two 64-bit inputs with a 128-bit result. The only explicit operand is a register, that we fill from the variable a. The second operand is implicit, and must be the rax register, which we fill from the variable b. The lower 64 bits of the result are stored in rax from which we fill the variable lo. The higher 64 bits are stored in rdx from which we fill the variable hi.

Clobbered registers

In many cases inline assembly will modify state that is not needed as an output. Usually this is either because we have to use a scratch register in the assembly or because instructions modify state that we don't need to further examine. This state is generally referred to as being "clobbered". We need to tell the compiler about this since it may need to save and restore this state around the inline assembly block.

use std::arch::asm;

#[cfg(target_arch = "x86_64")]
fn main() {
    // three entries of four bytes each
    let mut name_buf = [0_u8; 12];
    // String is stored as ascii in ebx, edx, ecx in order
    // Because ebx is reserved, the asm needs to preserve the value of it.
    // So we push and pop it around the main asm.
    // 64 bit mode on 64 bit processors does not allow pushing/popping of
    // 32 bit registers (like ebx), so we have to use the extended rbx register instead.

    unsafe {
        asm!(
            "push rbx",
            "cpuid",
            "mov [rdi], ebx",
            "mov [rdi + 4], edx",
            "mov [rdi + 8], ecx",
            "pop rbx",
            // We use a pointer to an array for storing the values to simplify
            // the Rust code at the cost of a couple more asm instructions
            // This is more explicit with how the asm works however, as opposed
            // to explicit register outputs such as `out("ecx") val`
            // The *pointer itself* is only an input even though it's written behind
            in("rdi") name_buf.as_mut_ptr(),
            // select cpuid 0, also specify eax as clobbered
            inout("eax") 0 => _,
            // cpuid clobbers these registers too
            out("ecx") _,
            out("edx") _,
        );
    }

    let name = core::str::from_utf8(&name_buf).unwrap();
    println!("CPU Manufacturer ID: {}", name);
}

#[cfg(not(target_arch = "x86_64"))]
fn main() {}

In the example above we use the cpuid instruction to read the CPU manufacturer ID. This instruction writes to eax with the maximum supported cpuid argument and ebx, edx, and ecx with the CPU manufacturer ID as ASCII bytes in that order.

Even though eax is never read we still need to tell the compiler that the register has been modified so that the compiler can save any values that were in these registers before the asm. This is done by declaring it as an output but with _ instead of a variable name, which indicates that the output value is to be discarded.

This code also works around the limitation that ebx is a reserved register by LLVM. That means that LLVM assumes that it has full control over the register and it must be restored to its original state before exiting the asm block, so it cannot be used as an input or output except if the compiler uses it to fulfill a general register class (e.g. in(reg)). This makes reg operands dangerous when using reserved registers as we could unknowingly corrupt our input or output because they share the same register.

To work around this we use rdi to store the pointer to the output array, save ebx via push, read from ebx inside the asm block into the array and then restore ebx to its original state via pop. The push and pop use the full 64-bit rbx version of the register to ensure that the entire register is saved. On 32 bit targets the code would instead use ebx in the push/pop.

This can also be used with a general register class to obtain a scratch register for use inside the asm code:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

// Multiply x by 6 using shifts and adds
let mut x: u64 = 4;
unsafe {
    asm!(
        "mov {tmp}, {x}",
        "shl {tmp}, 1",
        "shl {x}, 2",
        "add {x}, {tmp}",
        x = inout(reg) x,
        tmp = out(reg) _,
    );
}
assert_eq!(x, 4 * 6);
}
}

Symbol operands and ABI clobbers

By default, asm! assumes that any register not specified as an output will have its contents preserved by the assembly code. The clobber_abi argument to asm! tells the compiler to automatically insert the necessary clobber operands according to the given calling convention ABI: any register which is not fully preserved in that ABI will be treated as clobbered. Multiple clobber_abi arguments may be provided and all clobbers from all specified ABIs will be inserted.

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

extern "C" fn foo(arg: i32) -> i32 {
    println!("arg = {}", arg);
    arg * 2
}

fn call_foo(arg: i32) -> i32 {
    unsafe {
        let result;
        asm!(
            "call {}",
            // Function pointer to call
            in(reg) foo,
            // 1st argument in rdi
            in("rdi") arg,
            // Return value in rax
            out("rax") result,
            // Mark all registers which are not preserved by the "C" calling
            // convention as clobbered.
            clobber_abi("C"),
        );
        result
    }
}
}
}

Register template modifiers

In some cases, fine control is needed over the way a register name is formatted when inserted into the template string. This is needed when an architecture's assembly language has several names for the same register, each typically being a "view" over a subset of the register (e.g. the low 32 bits of a 64-bit register).

By default the compiler will always choose the name that refers to the full register size (e.g. rax on x86-64, eax on x86, etc).

This default can be overridden by using modifiers on the template string operands, just like you would with format strings:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut x: u16 = 0xab;

unsafe {
    asm!("mov {0:h}, {0:l}", inout(reg_abcd) x);
}

assert_eq!(x, 0xabab);
}
}

In this example, we use the reg_abcd register class to restrict the register allocator to the 4 legacy x86 registers (ax, bx, cx, dx) of which the first two bytes can be addressed independently.

Let us assume that the register allocator has chosen to allocate x in the ax register. The h modifier will emit the register name for the high byte of that register and the l modifier will emit the register name for the low byte. The asm code will therefore be expanded as mov ah, al which copies the low byte of the value into the high byte.

If you use a smaller data type (e.g. u16) with an operand and forget to use template modifiers, the compiler will emit a warning and suggest the correct modifier to use.

Memory address operands

Sometimes assembly instructions require operands passed via memory addresses/memory locations. You have to manually use the memory address syntax specified by the target architecture. For example, on x86/x86_64 using Intel assembly syntax, you should wrap inputs/outputs in [] to indicate they are memory operands:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

fn load_fpu_control_word(control: u16) {
    unsafe {
        asm!("fldcw [{}]", in(reg) &control, options(nostack));
    }
}
}
}

Labels

Any reuse of a named label, local or otherwise, can result in an assembler or linker error or may cause other strange behavior. Reuse of a named label can happen in a variety of ways including:

  • explicitly: using a label more than once in one asm! block, or multiple times across blocks.
  • implicitly via inlining: the compiler is allowed to instantiate multiple copies of an asm! block, for example when the function containing it is inlined in multiple places.
  • implicitly via LTO: LTO can cause code from other crates to be placed in the same codegen unit, and so could bring in arbitrary labels.

As a consequence, you should only use GNU assembler numeric local labels inside inline assembly code. Defining symbols in assembly code may lead to assembler and/or linker errors due to duplicate symbol definitions.

Moreover, on x86 when using the default Intel syntax, due to an LLVM bug, you shouldn't use labels exclusively made of 0 and 1 digits, e.g. 0, 11 or 101010, as they may end up being interpreted as binary values. Using options(att_syntax) will avoid any ambiguity, but that affects the syntax of the entire asm! block. (See Options, below, for more on options.)

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut a = 0;
unsafe {
    asm!(
        "mov {0}, 10",
        "2:",
        "sub {0}, 1",
        "cmp {0}, 3",
        "jle 2f",
        "jmp 2b",
        "2:",
        "add {0}, 2",
        out(reg) a
    );
}
assert_eq!(a, 5);
}
}

This will decrement the {0} register value from 10 to 3, then add 2 and store it in a.

This example shows a few things:

  • First, that the same number can be used as a label multiple times in the same inline block.
  • Second, that when a numeric label is used as a reference (as an instruction operand, for example), the suffixes “b” (“backward”) or ”f” (“forward”) should be added to the numeric label. It will then refer to the nearest label defined by this number in this direction.

Options

By default, an inline assembly block is treated the same way as an external FFI function call with a custom calling convention: it may read/write memory, have observable side effects, etc. However, in many cases it is desirable to give the compiler more information about what the assembly code is actually doing so that it can optimize better.

Let's take our previous example of an add instruction:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut a: u64 = 4;
let b: u64 = 4;
unsafe {
    asm!(
        "add {0}, {1}",
        inlateout(reg) a, in(reg) b,
        options(pure, nomem, nostack),
    );
}
assert_eq!(a, 8);
}
}

Options can be provided as an optional final argument to the asm! macro. We specified three options here:

  • pure means that the asm code has no observable side effects and that its output depends only on its inputs. This allows the compiler optimizer to call the inline asm fewer times or even eliminate it entirely.
  • nomem means that the asm code does not read or write to memory. By default the compiler will assume that inline assembly can read or write any memory address that is accessible to it (e.g. through a pointer passed as an operand, or a global).
  • nostack means that the asm code does not push any data onto the stack. This allows the compiler to use optimizations such as the stack red zone on x86-64 to avoid stack pointer adjustments.

These allow the compiler to better optimize code using asm!, for example by eliminating pure asm! blocks whose outputs are not needed.

See the reference for the full list of available options and their effects.

Compatibility

The Rust language is fastly evolving, and because of this certain compatibility issues can arise, despite efforts to ensure forwards-compatibility wherever possible.

원본 식별자

러스트에는 다른 프로그래밍 언어와 같이 "키워드"라는 컨셉이 존재합니다. 이 식별자는 언어 수준에서 특정한 의미를 가지고, 이에 따라 변수명이나 함수명 등등에 보통은 사용할 수 없습니다. 원본 식별자(r#)를 이용하면 보통이라면 이용할 수 없는 키워드를 식별자로 이용할 수 있게 해줍니다. 이 방식은 러스트에서 새로운 키워드를 발표함에 따라 이전 에디션의 러스트에서 사용한 변수/함수 식별자가 최신 에디션에서 키워드로 지정되었고 이를 활용해야 할 때 유용합니다.

예를 들어, 크레이트 foo가 러스트 2015 에디션에서 컴파일되었고 try라는 이름의 함수를 노출한다고 해봅시다. 이 키워드는 2018 에디션에서 키워드로 지정되었기 때문에, 원본 식별자 문법 없이는 그 함수를 더이상 활용할 수 없게 됩니다.

extern crate foo;

fn main() {
    foo::try();
}

이를 컴파일하면 이런 오류가 발생합니다:

error: expected identifier, found keyword `try`
 --> src/main.rs:4:4
  |
4 | foo::try();
  |      ^^^ expected identifier, found keyword

이를 해결하기 위해 원본 식별자를 이용해 작성하면 이렇게 됩니다:

extern crate foo;

fn main() {
    foo::r#try();
}

Meta

Some topics aren't exactly relevant to how you program but provide you tooling or infrastructure support which just makes things better for everyone. These topics include:

  • Documentation: Generate library documentation for users via the included rustdoc.
  • Playpen: Integrate the Rust Playpen(also known as the Rust Playground) in your documentation.

Documentation

Use cargo doc to build documentation in target/doc.

Use cargo test to run all tests (including documentation tests), and cargo test --doc to only run documentation tests.

These commands will appropriately invoke rustdoc (and rustc) as required.

Doc comments

Doc comments are very useful for big projects that require documentation. When running rustdoc, these are the comments that get compiled into documentation. They are denoted by a ///, and support Markdown.

#![crate_name = "doc"]

/// A human being is represented here
pub struct Person {
    /// A person must have a name, no matter how much Juliet may hate it
    name: String,
}

impl Person {
    /// Returns a person with the name given them
    ///
    /// # Arguments
    ///
    /// * `name` - A string slice that holds the name of the person
    ///
    /// # Examples
    ///
    /// ```
    /// // You can have rust code between fences inside the comments
    /// // If you pass --test to `rustdoc`, it will even test it for you!
    /// use doc::Person;
    /// let person = Person::new("name");
    /// ```
    pub fn new(name: &str) -> Person {
        Person {
            name: name.to_string(),
        }
    }

    /// Gives a friendly hello!
    ///
    /// Says "Hello, [name]" to the `Person` it is called on.
    pub fn hello(& self) {
        println!("Hello, {}!", self.name);
    }
}

fn main() {
    let john = Person::new("John");

    john.hello();
}

To run the tests, first build the code as a library, then tell rustdoc where to find the library so it can link it into each doctest program:

$ rustc doc.rs --crate-type lib
$ rustdoc --test --extern doc="libdoc.rlib" doc.rs

Doc attributes

Below are a few examples of the most common #[doc] attributes used with rustdoc.

inline

Used to inline docs, instead of linking out to separate page.

#[doc(inline)]
pub use bar::Bar;

/// bar docs
mod bar {
    /// the docs for Bar
    pub struct Bar;
}

no_inline

Used to prevent linking out to separate page or anywhere.

// Example from libcore/prelude
#[doc(no_inline)]
pub use crate::mem::drop;

hidden

Using this tells rustdoc not to include this in documentation:

// Example from the futures-rs library
#[doc(hidden)]
pub use self::async_await::*;

For documentation, rustdoc is widely used by the community. It's what is used to generate the std library docs.

See also:

Playpen

The Rust Playpen is a way to experiment with Rust code through a web interface. This project is now commonly referred to as Rust Playground.

Using it with mdbook

In mdbook, you can make code examples playable and editable.

fn main() {
    println!("Hello World!");
}

This allows the reader to both run your code sample, but also modify and tweak it. The key here is the adding the word editable to your codefence block separated by a comma.

```rust,editable
//...place your code here
```

Additionally, you can add ignore if you want mdbook to skip your code when it builds and tests.

```rust,editable,ignore
//...place your code here
```

Using it with docs

You may have noticed in some of the official Rust docs a button that says "Run", which opens the code sample up in a new tab in Rust Playground. This feature is enabled if you use the #[doc] attribute called html_playground_url.

See also: