osos.api_github.api_github.Github

class Github(owner, repo, token=None)[source]

Bases: object

Class to call the GitHub API and return OSOS-formatted usage data.

Parameters:
  • owner (str) – Repository owner, e.g. https://github.com/{owner}/{repo}

  • repo (str) – Repository name, e.g. https://github.com/{owner}/{repo}

  • token (str | None) – GitHub API authorization token. If None, the token is read from the GITHUB_TOKEN environment variable.
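The token fallback described above can be sketched as follows. This is a minimal illustration, not the class's actual implementation: resolve_token and auth_headers are hypothetical helper names, and the header layout is an assumption about how the token might be sent to the API.

```python
import os


def resolve_token(token=None, env_var="GITHUB_TOKEN"):
    """Return the explicit token, or fall back to the environment variable.

    Hypothetical helper: mirrors the documented behavior of the `token`
    parameter, but is not taken from the osos source.
    """
    if token is None:
        token = os.environ.get(env_var)
    return token


def auth_headers(token):
    """Build request headers for an authenticated call (assumed layout)."""
    # An empty dict means the request is sent unauthenticated.
    return {"Authorization": f"token {token}"} if token else {}
```

An explicit token argument always takes precedence; the environment variable is consulted only when the argument is None.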

Methods

clones(**kwargs)

Get the daily github repo clone data for the last two weeks.

commit_count(**kwargs)

Get the number of repo commits

commits([date_start, date_iter, search_all])

Get the number of commits by day in a given set of dates.

contributors(**kwargs)

Get the number of repo contributors

forks(**kwargs)

Get the number of repo forks.

get_generator(request, **kwargs)

Call the github API using the requests.get() method and merge all the paginated results into a single output

get_issues_pulls([option, state, get_lifetimes])

Get open/closed issues/pulls for the repo (all have the same general parsing format)

get_request(request, **kwargs)

Get the raw request output object

issues_closed([get_lifetimes])

Get data on the closed repo issues.

issues_open([get_lifetimes])

Get data on the open repo issues.

pulls_closed([get_lifetimes])

Get data on the closed repo pull requests.

pulls_open([get_lifetimes])

Get data on the open repo pull requests.

stargazers(**kwargs)

Get the number of repo stargazers

subscribers(**kwargs)

Get the number of repo subscribers

views(**kwargs)

Get the daily github repo views data for the last two weeks.

Attributes

BASE_REQ

TIME_FORMAT

get_issues_pulls(option='issues', state='open', get_lifetimes=False, **kwargs)[source]

Get open/closed issues/pulls for the repo (all have the same general parsing format)

Parameters:
  • option (str) – “issues” or “pulls”

  • state (str) – “open” or “closed”

  • get_lifetimes (bool) – Flag to get the lifetime statistics of issues/pulls. Default is False to reduce the number of API queries, because turning this on requires fetching the full data for every issue/pull. It is recommended that users retrieve lifetime statistics manually when desired and not as part of an automated OSOS workflow.

  • kwargs (dict) – Optional kwargs to get passed to requests.get()

Returns:

out (int | dict) – Integer count of issues/pulls if get_lifetimes=False, otherwise a dict with keys “{option}_{state}” and “{option}_{state}_*” for the count, lifetimes, and mean/median lifetime in days
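To make the get_lifetimes=True output more concrete, here is a hedged sketch of how lifetime statistics in days could be derived from issue/pull records. The function name, the key suffixes ("_lifetimes", "_mean_lifetime", "_median_lifetime"), the timestamp format, and the "created_at"/"closed_at" field names are all assumptions for illustration; the actual keys returned by the class may differ.

```python
from datetime import datetime, timezone
from statistics import mean, median

# Assumed ISO-8601 timestamp layout; the class attribute TIME_FORMAT may differ.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def lifetime_stats(items, option="issues", state="closed", now=None):
    """Summarize item lifetimes in days (hypothetical helper).

    Each item is a dict with "created_at" and, for closed items, "closed_at".
    Open items are measured up to `now`.
    """
    if now is None:
        now = datetime.now(timezone.utc).replace(tzinfo=None)

    lifetimes = []
    for item in items:
        created = datetime.strptime(item["created_at"], TIME_FORMAT)
        closed_raw = item.get("closed_at")
        closed = datetime.strptime(closed_raw, TIME_FORMAT) if closed_raw else now
        lifetimes.append((closed - created).total_seconds() / 86400)

    key = f"{option}_{state}"
    return {
        key: len(lifetimes),
        f"{key}_lifetimes": lifetimes,
        f"{key}_mean_lifetime": mean(lifetimes) if lifetimes else None,
        f"{key}_median_lifetime": median(lifetimes) if lifetimes else None,
    }
```

Computing these values needs the creation and close timestamps of every item, which is why the flag is off by default.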

get_request(request, **kwargs)[source]

Get the raw request output object

Parameters:
Returns:

out (requests.models.Response) – requests.get() output object.

get_generator(request, **kwargs)[source]

Call the github API using the requests.get() method and merge all the paginated results into a single output

Parameters:
Returns:

out (generator) – generator of list items in the request output
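Merging paginated results into one generator can be sketched as below. This is an illustration under stated assumptions, not the method's actual code: iter_paginated is a hypothetical name, the getter is injected so the sketch can run without network access (requests.get would be the natural choice), and it assumes each response exposes json() and a links mapping the way requests.Response does.

```python
def iter_paginated(url, get, **kwargs):
    """Yield every list item across all pages of a paginated endpoint.

    `get` is any callable with the shape get(url, **kwargs) -> response,
    for example requests.get. The response is assumed to provide .json()
    returning a list and .links holding the parsed Link header.
    """
    while url:
        response = get(url, **kwargs)
        for item in response.json():
            yield item
        # Follow the "next" relation until the last page has been read.
        url = response.links.get("next", {}).get("url")
```

Because items are yielded lazily, a caller that stops iterating early also stops further page requests.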

contributors(**kwargs)[source]

Get the number of repo contributors

Parameters:

kwargs (dict) – Optional kwargs to get passed to requests.get()

Returns:

out (int) – Number of contributors for the repo.

commit_count(**kwargs)[source]

Get the number of repo commits

Parameters:

kwargs (dict) – Optional kwargs to get passed to requests.get()

Returns:

out (int) – Total number of commits to the repo.

commits(date_start=None, date_iter=None, search_all=False, **kwargs)[source]

Get the number of commits by day in a given set of dates.

Parameters:
  • date_start (datetime.date | None) – Option to search for commits from this date to today. Either input this or the date_iter.

  • date_iter (list | tuple | pd.DatetimeIndex | None) – Iterable of dates to search for. Either input this or the date_start.

  • search_all (bool) – Flag to search all commits or to terminate early (default) when the commit date is before all dates in the date_iter

  • kwargs (dict) – Optional kwargs to get passed to requests.get()

Returns:

out (pd.DataFrame) – Timeseries of commit data based on date_iter as the index. Includes columns for “commits”.
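The effect of search_all can be illustrated with a small pure-Python sketch. It assumes commit dates arrive newest first, which is what makes early termination possible; count_commits_by_day is a hypothetical helper that returns a plain dict rather than the pd.DataFrame the method returns.

```python
from datetime import date


def count_commits_by_day(commit_dates, date_iter, search_all=False):
    """Count commits on each requested day (hypothetical helper).

    commit_dates: iterable of datetime.date, assumed to be newest first.
    date_iter: non-empty iterable of datetime.date values to report on.
    """
    wanted = sorted(set(date_iter))
    earliest = wanted[0]
    counts = {day: 0 for day in wanted}

    for commit_date in commit_dates:
        if commit_date in counts:
            counts[commit_date] += 1
        elif commit_date < earliest and not search_all:
            # Everything after this point is assumed to be older still.
            break
    return counts
```

With search_all=True the loop reads every commit, which costs more requests but also catches commits that appear out of chronological order.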

clones(**kwargs)[source]

Get the daily github repo clone data for the last two weeks.

Parameters:

kwargs (dict) – Optional kwargs to get passed to requests.get()

Returns:

out (pd.DataFrame) – Timeseries of daily git clone data. Includes columns for “clones” and “clones_unique”. Index is a pandas datetime index with just the datetime.date part.
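As a rough sketch of how the daily clone table could be assembled, the helper below converts a traffic payload into rows keyed by date. It assumes the payload follows the shape of GitHub's repository traffic endpoint (a "clones" list whose entries carry "timestamp", "count", and "uniques"); parse_clones and the timestamp format are illustrative assumptions, not the class's own code.

```python
from datetime import datetime

# Assumed timestamp layout; the class attribute TIME_FORMAT may differ.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_clones(payload):
    """Map each day to its clone counts (hypothetical helper)."""
    rows = {}
    for entry in payload.get("clones", []):
        # Keep only the date part, matching the documented index.
        day = datetime.strptime(entry["timestamp"], TIME_FORMAT).date()
        rows[day] = {
            "clones": entry["count"],
            "clones_unique": entry["uniques"],
        }
    return rows
```

The resulting mapping can be turned into a table with pandas.DataFrame.from_dict(rows, orient="index"). The same pattern would apply to views by swapping the column names.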

forks(**kwargs)[source]

Get the number of repo forks.

Parameters:

kwargs (dict) – Optional kwargs to get passed to requests.get()

Returns:

out (int) – The number of forks.

issues_closed(get_lifetimes=False, **kwargs)[source]

Get data on the closed repo issues.

Parameters:
  • get_lifetimes (bool) – Flag to get the lifetime statistics of issues/pulls. Default is False to reduce the number of API queries, because turning this on requires fetching the full data for every issue/pull. It is recommended that users retrieve lifetime statistics manually when desired and not as part of an automated OSOS workflow.

  • kwargs (dict) – Optional kwargs to get passed to requests.get()

Returns:

out (int | dict) – Number of closed issues, or if get_lifetimes is True, this returns a dict with additional metrics.

issues_open(get_lifetimes=False, **kwargs)[source]

Get data on the open repo issues.

Parameters:
  • get_lifetimes (bool) – Flag to get the lifetime statistics of issues/pulls. Default is False to reduce the number of API queries, because turning this on requires fetching the full data for every issue/pull. It is recommended that users retrieve lifetime statistics manually when desired and not as part of an automated OSOS workflow.

  • kwargs (dict) – Optional kwargs to get passed to requests.get()

Returns:

out (int | dict) – Number of open issues, or if get_lifetimes is True, this returns a dict with additional metrics.

pulls_closed(get_lifetimes=False, **kwargs)[source]

Get data on the closed repo pull requests.

Parameters:
  • get_lifetimes (bool) – Flag to get the lifetime statistics of issues/pulls. Default is False to reduce the number of API queries, because turning this on requires fetching the full data for every issue/pull. It is recommended that users retrieve lifetime statistics manually when desired and not as part of an automated OSOS workflow.

  • kwargs (dict) – Optional kwargs to get passed to requests.get()

Returns:

out (int | dict) – Number of closed pull requests, or if get_lifetimes is True, this returns a dict with additional metrics.

pulls_open(get_lifetimes=False, **kwargs)[source]

Get data on the open repo pull requests.

Parameters:
  • get_lifetimes (bool) – Flag to get the lifetime statistics of issues/pulls. Default is False to reduce the number of API queries, because turning this on requires fetching the full data for every issue/pull. It is recommended that users retrieve lifetime statistics manually when desired and not as part of an automated OSOS workflow.

  • kwargs (dict) – Optional kwargs to get passed to requests.get()

Returns:

out (int | dict) – Number of open pull requests, or if get_lifetimes is True, this returns a dict with additional metrics.

stargazers(**kwargs)[source]

Get the number of repo stargazers

Parameters:

kwargs (dict) – Optional kwargs to get passed to requests.get()

Returns:

out (int) – Number of stargazers for the repo.

subscribers(**kwargs)[source]

Get the number of repo subscribers

Parameters:

kwargs (dict) – Optional kwargs to get passed to requests.get()

Returns:

out (int) – Number of subscribers for the repo.

views(**kwargs)[source]

Get the daily github repo views data for the last two weeks.

Parameters:

kwargs (dict) – Optional kwargs to get passed to requests.get()

Returns:

out (pd.DataFrame) – Timeseries of daily git views data. Includes columns for “views” and “views_unique”. Index is a pandas datetime index with just the datetime.date part.