Commit graph

22 commits

Author SHA1 Message Date
Neil Alexander 07d0e72a8b
Improve roomserver logging 2022-01-31 15:33:00 +00:00
Neil Alexander a763cbb0e1
Roomserver/federation input refactor (#2104)
* Put federation client functions into their own file

* Look for missing auth events in RS input

* Remove retrieveMissingAuthEvents from federation API

* Logging

* Sorta transplanted the code over

* Use event origin failing all else

* Don't get stuck on mutexes:

* Add verifier

* Don't mark state events with zero snapshot NID as not existing

* Check missing state if not an outlier before storing the event

* Reject instead of soft-fail, don't copy roominfo so much

* Use synchronous contexts, limit time to fetch missing events

* Clean up some commented out bits

* Simplify `/send` endpoint significantly

* Submit async

* Report errors on sending to RS input

* Set max payload in NATS to 16MB

* Tweak metrics

* Add `workerForRoom` for tidiness

* Try skipping unmarshalling errors for RespMissingEvents

* Track missing prev events separately to avoid calculating state when not possible

* Tweak logic around checking missing state

* Care about state when checking missing prev events

* Don't check missing state for create events

* Try that again

* Handle create events better

* Send create room events as new

* Use given event kind when sending auth/state events

* Revert "Use given event kind when sending auth/state events"

This reverts commit 089d64d271.

* Only search for missing prev events or state for new events

* Tweaks

* We only have missing prev if we don't supply state

* Room version tweaks

* Allow async inputs again

* Apply backpressure to consumers/synchronous requests to hopefully stop things being overwhelmed

* Set timeouts on roomserver input tasks (need to decide what timeout makes sense)

* Use work queue policy, deliver all on restart

* Reduce chance of duplicates being sent by NATS

* Limit the number of servers we attempt to reduce backpressure

* Some review comment fixes

* Tidy up a couple things

* Don't limit servers, randomise order using map

* Some context refactoring

* Update gmsl

* Don't resend create events

* Set stateIDs length correctly or else the roomserver thinks there are missing events when there aren't

* Exclude our own servername

* Try backing off servers

* Make excluding self behaviour optional

* Exclude self from g_m_e

* Update sytest-whitelist

* Update consumers for the roomserver output stream

* Remember to send outliers for state returned from /gme

* Make full HTTP tests less upsetti

* Remove 'If a device list update goes missing, the server resyncs on the next one' from the sytest blacklist

* Remove debugging test

* Fix blacklist again, remove unnecessary duplicate context

* Clearer contexts, don't use background in case there's something happening there

* Don't queue up events more than once in memory

* Correctly identify create events when checking for state

* Fill in gaps again in /gme code

* Remove `AuthEventIDs` from `InputRoomEvent`

* Remove stray field

Co-authored-by: Kegan Dougal <kegan@matrix.org>
2022-01-27 14:29:14 +00:00
S7evinK 161f145176
Add NATS JetStream support (#1866)
* Add NATS JetStream support
Update shopify/sarama

* Fix addresses

* Don't change Addresses in Defaults

* Update saramajetstream

* Add missing error check

Keep typing events for at least one minute

* Use all configured NATS addresses

* Update saramajetstream

* Try setting up with NATS

* Make sure NATS uses own persistent directory (TODO: make this configurable)

* Update go.mod/go.sum

* Jetstream package

* Various other refactoring

* Build fixes

* Config tweaks, make random jetstream storage path for CI

* Disable interest policies

* Try to sane default on jetstream base path

* Try to use in-memory for CI

* Restore storage/retention

* Update nats.go dependency

* Adapt changes to config

* Remove unneeded TopicFor

* Dep update

* Revert "Remove unneeded TopicFor"

This reverts commit f5a4e4a339.

* Revert changes made to streams

* Fix build problems

* Update nats-server

* Update go.mod/go.sum

* Roomserver input API queuing using NATS

* Fix topic naming

* Prometheus metrics

* More refactoring to remove saramajetstream

* Add missing topic

* Don't try to populate map that doesn't exist

* Roomserver output topic

* Update go.mod/go.sum

* Message acknowledgements

* Ack tweaks

* Try to resume transaction re-sends

* Try to resume transaction re-sends

* Update to matrix-org/gomatrixserverlib@91dadfb

* Remove internal.PartitionStorer from components that don't consume keychanges

* Try to reduce re-allocations a bit in resolveConflictsV2

* Tweak delivery options on RS input

* Publish send-to-device messages into correct JetStream subject

* Async and sync roomserver input

* Update dendrite-config.yaml

* Remove roomserver tests for now (they need rewriting)

* Remove roomserver test again (was merged back in)

* Update documentation

* Docker updates

* More Docker updates

* Update Docker readme again

* Fix lint issues

* Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)

* Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that

* Go 1.16 instead of Go 1.13 for upgrade tests and Complement

* Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"

This reverts commit 368675283f.

* Don't report any errors on `/send` to see what fun that creates

* Fix panics on closed channel sends

* Enforce state key matches sender

* Do the same for leave

* Various tweaks to make tests happier

Squashed commit of the following:

commit 13f9028e7a
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 15:47:14 2022 +0000

    Do the same for leave

commit e6be7f05c3
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 15:33:42 2022 +0000

    Enforce state key matches sender

commit 85ede6d64b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 14:07:04 2022 +0000

    Fix panics on closed channel sends

commit 9755494a98
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 13:38:22 2022 +0000

    Don't report any errors on `/send` to see what fun that creates

commit 3bb4f87b5d
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 13:00:26 2022 +0000

    Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"

    This reverts commit 368675283f.

commit fe2673ed7b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 12:09:34 2022 +0000

    Go 1.16 instead of Go 1.13 for upgrade tests and Complement

commit 368675283f
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 11:51:45 2022 +0000

    Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that

commit b028dfc085
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 10:29:08 2022 +0000

    Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)

* Merge in NATS Server v2.6.6 and nats.go v1.13 into the in-process connection fork

* Add `jetstream.WithJetStreamMessage` to make ack/nak-ing less messy, use process context in consumers

* Fix consumer component name in  federation API

* Add comment explaining where streams are defined

* Tweaks to roomserver input with comments

* Finish that sentence that I apparently forgot to finish in INSTALL.md

* Bump version number of config to 2

* Add comments around asynchronous sends to roomserver in processEventWithMissingState

* More useful error message when the config version does not match

* Set version in generate-config

* Fix version in config.Defaults

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-01-05 17:44:49 +00:00
Neil Alexander 20a01bceb2
Pass pointers to events — reloaded (#1583)
* Pass events as pointers

* Fix lint errors

* Update gomatrixserverlib

* Update gomatrixserverlib

* Update to matrix-org/gomatrixserverlib#240
2020-11-16 15:44:53 +00:00
Neil Alexander 6e63df1d9a
KindOld (#1531)
* Add KindOld

* Don't process latest events/memberships for old events

* Allow federationsender to ignore duplicate key entries when LatestEventIDs is duplicated by RS output events

* Signal to downstream components if an event has become a forward extremity

* Don't exclude from sync

* Soft-fail checks on KindNew

* Don't run the latest events updater at all for KindOld

* Don't make federation sender change after all

* Kind in federation sender join

* Don't send isForwardExtremity

* Fix syncapi

* Update comments

* Fix SendEventWithState

* Update sytest-whitelist

* Generate old output events

* Sync API consumes old room events

* Update comments
2020-10-19 14:59:13 +01:00
Neil Alexander d821f9d3c9
Deep checking of forward extremities (#1491)
* Deep forward extremity calculation

* Use updater txn

* Update error

* Update error

* Create previous event references in StoreEvent

* Use latest events updater to row-lock prev events

* Fix unexpected fallthrough

* Fix deadlock

* Don't roll back

* Update comments in calculateLatest

* Don't include events that we can't find references for in the forward extremities

* Add another passing test
2020-10-07 14:05:33 +01:00
Neil Alexander bf90db5b60
Remove KindRewrite (#1481)
* Don't send rewrite events

* Remove final traces of rewrite events

* Remove test that is no longer needed

* Revert "Remove test that is no longer needed"

This reverts commit 9a45babff6.

* Update test to use KindOutlier
2020-10-06 11:05:00 +01:00
Kegsay 18231f25b4
Implement rejected events (#1426)
* WIP Event rejection

* Still send back errors for rejected events

Instead, discard them at the federationapi /send layer rather than
re-implementing checks at the clientapi/PerformJoin layer.

* Implement rejected events

Critically, rejected events CAN cause state resolution to happen
as it can merge forks in the DAG. This is fine, _provided_ we
do not add the rejected event when performing state resolution,
which is what this PR does. It also fixes the error handling
when NotAllowed happens, as we were checking too early and needlessly
handling NotAllowed in more than one place.

* Update test to match reality

* Modify InputRoomEvents to no longer return an error

Errors do not serialise across HTTP boundaries in polylith mode,
so instead set fields on the InputRoomEventsResponse. Add `Err()`
function to make the API shape basically the same.

* Remove redundant returns; linting

* Update blacklist
2020-09-16 13:00:52 +01:00
Neil Alexander 965f068d1a
Handle state with input event as new events (#1415)
* SendEventWithState events as new

* Use cumulative state IDs for final event

* Error wrapping in calculateAndSetState

* Handle overwriting same event type and state key

* Hacky way to spot historical events

* Don't exclude from sync

* Don't generate output events when rewriting forward extremities

* Update output event check

* Historical output events

* Define output room event type

* Notify key changes on state

* Don't send our membership event twice

* Deduplicate state entries

* Tweaks

* Remove unnecessary nolint

* Fix current state upsert in sync API

* Send auth events as outliers, state events as rewrite

* Sync API don't consume state events

* Process events actually

* Improve outlier check

* Fix local room check

* Remove extra room check, it seems to break the whole damn world

* Fix federated join check

* Fix nil pointer exception

* Better comments on DeduplicateStateEntries

* Reflow forced federated joins

* Don't force federated join for possibly even local invites

* Comment SendEventWithState better

* Rewrite room state in sync API storage

* Add TODO

* Clean up all room data when receiving create event

* Don't generate output events for rewrites, but instead notify that state is rewritten on the final new event

* Rename to PurgeRoom

* Exclude backfilled messages from /sync

* Split out rewriting state from updating state from state res

Co-authored-by: Kegan Dougal <kegan@matrix.org>
2020-09-15 11:17:46 +01:00
Neil Alexander 6150de6cb3
FIFO ordering of input events (#1386)
* Initial FIFOing of roomserver inputs

* Remove EventID response from api.InputRoomEventsResponse

* Don't send back event ID unnecessarily

* Fix ordering hopefully

* Reduce copies, use buffered task channel to reduce contention on other rooms

* Fix error handling
2020-09-03 15:22:16 +01:00
Kegsay 002fe05a20
Add PerformInvite and refactor how errors get handled (#1158)
* Add PerformInvite and refactor how errors get handled

- Rename `JoinError` to `PerformError`
- Remove `error` from the API function signature entirely. This forces
  errors to be bundled into `PerformError` which makes it easier for callers
  to detect and handle errors. On network errors, HTTP clients will make a
  `PerformError`.

* Unbreak everything; thanks Go!

* Send back JSONResponse according to the PerformError

* Update federation invite code too
2020-06-24 15:06:14 +01:00
Kegsay 9834ac97db
Convert everything but serverkeyapi to inthttp (#1096)
* Convert roomserver to new inthttp format

* Convert eduserver to new inthttp format

* Convert appservice to new inthttp format
2020-06-04 15:43:07 +01:00
Neil Alexander fe82e1f725
Separate muxes for public and internal APIs (#1056)
* Separate muxes for public and internal APIs

* Update client-api-proxy and federation-api-proxy so they don't add /api to the path

* Tidy up

* Consistent HTTP setup

* Set up prefixes properly
2020-05-22 11:43:17 +01:00
Kegsay 24d8df664c
Fix #897 and shuffle directory around (#1054)
* Fix #897 and shuffle directory around

* Update find-lint

* goimports

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2020-05-21 14:40:13 +01:00
Neil Alexander e15f6676ac
Consolidation of roomserver APIs (#994)
* Consolidation of roomserver APIs

* Comment out alias tests for now, they are broken

* Wire AS API into roomserver again

* Roomserver didn't take asAPI param before so return to that

* Prevent roomserver asking AS API for alias info

* Rename some files

* Remove alias_test, incoherent tests and unwanted appservice integration

* Remove FS API inject on syncapi component
2020-05-01 10:48:17 +01:00
Neil Alexander a308e61331
Federation sender API remodel (#988)
* Define an input API for the federationsender

* Wiring for rooomserver input API and federation sender input API

* Whoops, commit common too

* Merge input API into query API

* Rename FederationSenderQueryAPI to FederationSenderInternalAPI

* Fix dendritejs

* Rename Input to Perform

* Fix a couple of inputs -> performs

* Remove needless storage interface, add comments
2020-04-29 11:34:31 +01:00
Neil Alexander 3ab8ebf6b8
More invite support (#979)
* Update gomatixserverlib

* Try to build invite stripped state if not given to us

* SendInvite improvements

* Transpose invite_room_state into invite_state.events for sync API

* Remove syncapi debugging output

* Use RespInviteV2

* Update gomatrixserverlib

* Send the invite event as a normal roomserver event too, for incorporating into room (should this be done by the roomserver automatically for invite inputs?)

* Federation sender use invite_room_state, room server try to insert membership state

* Check supported room versions on the invite endpoint

* Prevent roomserver query API from trying to handle requests for stub rooms

* Adding a nolint

* Replace IsRoomStub with RoomNIDExcludingStubs, fix query API to use that instead

* Review comments
2020-04-24 16:30:25 +01:00
Neil Alexander 067b875063
Invites v2 endpoint (#952)
* Start converting v1 invite endpoint to v2

* Update gomatrixserverlib

* Early federationsender code for sending invites

* Sending invites sorta happens now

* Populate invite request with stripped state

* Remodel a bit, don't reflect received invites

* Handle invite_room_state

* Handle room versions a bit better

* Update gomatrixserverlib

* Tweak order in destinationQueue.next

* Revert check in processMessage

* Tweak federation sender destination queue code a bit

* Add comments
2020-04-03 14:29:06 +01:00
Ben B 955244c092
use custom http client instead of the http DefaultClient (#823)
This commit replaces the default client from the http lib with a custom one.
The previously used default client doesn't come with a timeout. This could cause
unwanted locks.
That solution chosen here creates a http client in the base component dendrite
with a constant timeout of 30 seconds. If it should be necessary to overwrite
this, we could include the timeout in the dendrite configuration.
Here it would be a good idea to extend the type "Address" by a timeout and
create an http client for each service.

Closes #820

Signed-off-by: Benedikt Bongartz <benne@klimlive.de>

Co-authored-by: Kegsay <kegan@matrix.org>
2020-04-03 11:40:50 +01:00
Neil Alexander aebf347a79
Implement gomatrixserverlib.HeaderedEvent in roomserver Kafka output (#914)
* Use Event.Headered

* Use HeaderedEvent in roomserver kafka output

* Fix syncserver-integration-tests

* Update producers to roomserver inputs

* Update gomatrixserverlib

* Update gomatrixserverlib

* Update gomatrixserverlib

* Update gomatrixserverlib

* Update gomatrixserverlib

* Update gomatrixserverlib
2020-03-17 11:01:25 +00:00
Alex Chen 43308d2f3f
Associate transactions with session IDs instead of device IDs (#789) 2019-08-24 00:55:40 +08:00
ruben 74827428bd use go module for dependencies (#594) 2019-05-21 21:56:55 +01:00
Renamed from src/github.com/matrix-org/dendrite/roomserver/api/input.go (Browse further)