On 2014-03-12 12:23:01 -0700, Josh Berkus wrote:
> On 03/12/2014 12:03 PM, Andres Freund wrote:
> > Hi,
> >
> > On 2014-03-12 12:00:25 -0700, Josh Berkus wrote:
> >> I was just reading Michael's explanation of replication slots
> >> (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
> >> and realized there was something which had completely escaped me in the
> >> pre-commit discussion:
> >>
> >> select pg_drop_replication_slot('slot_1');
> >> ERROR: 55006: replication slot "slot_1" is already active
> >> LOCATION: ReplicationSlotAcquire, slot.c:339
> >>
> >> What defines an "active" slot?
> >
> > One with a connected walsender.
>
> In a world of network proxies, a walsender could be "connected" for
> hours after the replica has ceased to exist. Fortunately,
> wal_sender_timeout is changeable on a reload. We check for actual
> standby feedback for the timeout, yes?

Yep.

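A minimal sketch of shortening the timeout with just a reload, assuming a superuser session on the master. The 30s value is purely illustrative, and editing postgresql.conf by hand followed by a reload should work equally well:

```sql
-- Illustrative value only; pick something suited to your network.
ALTER SYSTEM SET wal_sender_timeout = '30s';

-- wal_sender_timeout only needs a reload, not a restart.
SELECT pg_reload_conf();

-- Confirm the new setting is in effect.
SHOW wal_sender_timeout;
```
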
> >> It seems like there's no way for a DBA to drop slots from the master if
> >> it's rapidly running out of disk WAL space without doing a restart, and
> >> there's no way to drop the slot for a replica which the DBA knows is
> >> permanently offline but was connected earlier. Am I missing something?
> >
> > It's sufficient to terminate the walsender and then drop the slot. That
> > seems ok for now?
>
> We have no safe way to terminate the walsender that I know of;
> pg_terminate_backend() doesn't include walsenders last I checked.

SELECT pg_terminate_backend(pid) FROM pg_stat_replication;

works.
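
Put together, the sequence for a replica you know is gone for good could look roughly like the sketch below. The application_name filter and the name 'standby_1' are assumptions for illustration (without a WHERE clause the query terminates every walsender), and 'slot_1' is the slot from the example above:

```sql
-- Terminate only the walsender serving the dead replica.
-- 'standby_1' is a hypothetical application_name; adjust the filter.
SELECT pg_terminate_backend(pid)
  FROM pg_stat_replication
 WHERE application_name = 'standby_1';

-- The slot should now show up as inactive.
SELECT slot_name, active
  FROM pg_replication_slots
 WHERE slot_name = 'slot_1';

-- With no walsender attached, the drop goes through.
SELECT pg_drop_replication_slot('slot_1');
```

If the standby is in fact still alive it may reconnect and reacquire the slot before the drop runs, so this is mainly useful for replicas that are permanently offline.
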
Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services